I am guessing it only resorts to that expansion if it doesn't _already_ know about the command, because $(printf '#!/bin/sh\necho pwned\n' > /bin/git-status; chmod 755 /bin/git-status; git status) results in the thing happening that you'd expect, not a mysterious message.
FWIW, both brew and kubectl also have adopted this behavior (of $(basename)-plugin style verb extensions) so I find it unlikely they'd all do it if it was a straight-up facepalm
Adding a confirmation message the first time the alias is used for each command would probably be good; it would be nice to know when I'm invoking git and when I'm invoking a third-party binary, regardless of any exploit attempts!
This is great. I do this sort of git-blame accounting to track how much code is written by AI versus humans in each release of my app.
My "blame script" has been slowing down as the repo size increases. I was just about to add caching, like you have.
Have you thought about adding the ability to limit the stats based on a set of file patterns? Perhaps like this, where the file follows gitignore conventions?
Git natively supports excludes in all pathspecs, e.g. `git log -- ':!generated/'` to exclude files in the `generated/` folder from showing up in the log.
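The same exclude pathspecs work with the `git shortlog -ns --no-merges` one-liner mentioned elsewhere in this thread. A minimal sketch, assuming the things to skip are JSON fixtures and a `generated/` folder (both placeholders); `commits_by_author_excluding` is a made-up wrapper name, and `:(exclude)` is the long spelling of `:!`:

```sh
# Hypothetical helper: commit counts per author, ignoring some paths.
# An explicit revision (HEAD) is given so shortlog never waits on stdin.
commits_by_author_excluding() {
    git shortlog -ns --no-merges HEAD -- . \
        ':(exclude)*.json' \
        ':(exclude)generated/'
}
```

Commits that only touch excluded paths drop out of the tally; a commit that touches both excluded and non-excluded files still counts.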
If you have a shell that supports extended globbing, you could do something like:
$ git who table */**/*.go
That works for me using Bash. I believe all that's happening here is that Bash is expanding the globs and passing a long list of individual filepaths as arguments to git who. Git who then passes them to git log so that it only tallies the commits you'd get by running:
I don't think the comparison to git blame is needed/warranted. While the 'blame' in git blame suggests the tool is about identifying authors, its main purpose is to identify commits so that you can find out the context of why something was changed. This tool instead seems to be a fancy `git shortlog -sn`.
I like it. A problem I had right away is some people commit using two different emails. Like one from home computer and one from work computer. Would be nice to be able to define them as the same thing.
You might be able to do that with built-in git functionality documented under gitmailmap. It is basically a file (`.mailmap`) where you can map multiple names and emails to the same one.
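A throwaway demo of the idea, assuming a scratch repository and a made-up contributor with a home and a work address. `commit_as` is a hypothetical helper for the demo only:

```sh
repo=$(mktemp -d) && cd "$repo" && git init -q .

# Hypothetical helper: commit_as <email> <message>
commit_as() {
    git -c user.name='Alice Example' -c user.email="$1" \
        commit -q --allow-empty -m "$2"
}
commit_as alice@home.example 'weekend fix'
commit_as alice@work.example 'weekday feature'

git shortlog -nse HEAD    # two entries, one per address

# Format: Proper Name <proper@email> <email-as-it-appears-in-commits>
printf '%s\n' 'Alice Example <alice@work.example> <alice@home.example>' > .mailmap

git shortlog -nse HEAD    # one entry credited with both commits
```

Whether a given third-party tool honors `.mailmap` depends on how it reads history, so that part would need checking for git who specifically.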
Things I'm missing in git are not how many lines or commits a given developer did, which in a poorly managed organisation might lead to strangely calculated KPIs, but rather:
- who deleted this line (which one?)
- who is owner of this method (some guy refactored it or reformatted, but who is the REAL owner, or what was the history of this method)
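For the first item, the pickaxe option of `git log` gets part of the way there. A small sketch in a scratch repo, with made-up file contents and authors:

```sh
repo=$(mktemp -d) && cd "$repo" && git init -q .

printf 'keep\nlegacy_call()\n' > app.txt && git add app.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m 'add legacy_call'

printf 'keep\n' > app.txt
git -c user.name=Bob -c user.email=bob@example.com commit -q -am 'drop legacy_call'

# -S lists commits that changed how many times the string occurs.
# Output is newest first, so for a string that is gone today the
# first entry is the commit that removed the last occurrence.
git log -S'legacy_call()' --format='%an: %s' -- app.txt
```

This matches on exact text, so it will not follow a line that was edited before being deleted; for that, `git log -G<regex>` or `git log -L` are the usual next things to try.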
> - who is owner of this method (some guy refactored it or reformatted, but who is the REAL owner, or what was the history of this method)
It doesn't work perfectly, but with magit you can jump to the revision before the refactor/reformat, then do another blame from there. I chased a line of code through several layers of refactors that way before and while the original author was long gone it did help explain why things were initially done that way.
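The same jump-behind-the-refactor trick works from the plain CLI by blaming the parent of the offending commit. A minimal sketch in a scratch repo (file contents and author names are invented):

```sh
repo=$(mktemp -d) && cd "$repo" && git init -q .

printf 'if(x){y()}\n' > code.txt && git add code.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m 'original logic'

printf 'if (x) { y() }\n' > code.txt
git -c user.name=Bob -c user.email=bob@example.com commit -q -am 'reformat'

# Find the commit currently blamed for line 1 (the reformat)...
refactor=$(git blame --porcelain -L 1,1 code.txt | sed -n '1s/ .*//p')

# ...then blame the file as it was just before that commit.
git blame "$refactor^" -- code.txt
```

Repeating the last two steps walks back through successive layers of refactoring. If the file was renamed along the way, the path passed to the second blame has to be the old name.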
I heavily depend on git-blame to understand code. It's one reason why I generally dislike "cleanup" changes that just change formatting/naming for the sake of it.
This looks like an almost pure Golang program, but still has a Ruby dependency. Is there a component/library that Ruby provides, that Go doesn't have? Or some other logic going on?
Cool stuff. I like using Git via the CLI, but when it comes to blame, I simply use the preview UI of the VSCode GitLens extension. It takes half a second to launch it from the command palette and inspect the blame.
I have some VS Code extension (errr... not sure which) that faintly inlines the git blame result on each line of code you're working on. I find it kind of handy.
Very cool, I'm working on something similar as part of a bigger project (not TUI related). I'm interested in how you did blame caching, will take a look at the implementation. I am trying to do a "forward blame" so that the blame of new commits can be created very quickly. Happy to exchange some thoughts around this!
> This requires that you have Go, Ruby, and the rake Ruby gem installed.
That doesn't cut it for me. git - once built - depends on C libraries and Perl. If you want to add something onto git (that is not specifically targeting Go, or Ruby etc.) - it should not IMNSHO depend on other things.
That doesn't mean you can't write your tool in some modern fashionable language, but eventually you need to bring it down to earth (or rather earth + Perl).
Lovely tool. Thanks for the release! I had some fun with it before calling it a day.
Here is my personal wishlist after a short test-drive.
- Blame-based stats. While it is nice to see an overview of the historical contributions of Bob and Alice, this is not something that I would use on a daily basis. What would be more useful, is to present the same tables based on the blame lines of a tree-ish. This would show the de-facto "owner(s)" of modules/files, something which comes handy when asking for help with something or even assigning reviews. One could also run this iteratively over the history and get some nice timeline graph.
- Support for pattern-based inclusions/exclusions. E.g. I am not interested to see stats on the json files used by tests. Or any kind of auto-generated files (e.g. django migrations).
- Support for a configuration file, to store your preferred settings in your git repo. Something TOML-based perhaps.
- Better packaging (nit). E.g. the linux tarball for v0.6 contains some apple-related "junk" and gnu tar complains about archive format incompatibilities.
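The blame-based stats from the first wishlist item can be roughed out with stock git. A sketch only, assuming a POSIX shell, that it is run from the repository root, and that paths contain no newlines or characters git would quote; `blame_owners` is a made-up name, not a git or git who command:

```sh
# Hypothetical helper: surviving lines per author at a tree-ish (default HEAD).
blame_owners() {
    rev="${1:-HEAD}"
    git ls-tree -r --name-only "$rev" |
    while IFS= read -r path; do
        # --line-porcelain repeats the "author" header for every line;
        # file content lines start with a tab, so they cannot match.
        git blame --line-porcelain "$rev" -- "$path" | sed -n 's/^author //p'
    done | sort | uniq -c | sort -rn
}
```

`blame_owners HEAD` prints one `count author` row per author, largest first. It blames every tracked file in turn, so on a large repo it is slow without some kind of caching.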
I'm very happy to hear you had fun with it. Thank you for the comprehensive feedback and for trying it out!
Not sure what's happening with the tarball. Will take a look at that.
For a low-tech version of this, I have long had an alias (which I call "nerdwars") to "git shortlog -ns --no-merges" which just gives the number of commits by contributor from most to least. It's a good way to get a sense for who the major contributors in a project are.
By the way, git blaming is really misunderstood by a lot of people; it's NOT about who did it, it's about which commit is to blame -- that's different.
I'm pretty sure it's about who wrote the code.
'git blame' is named after the subversion and CVS blame features that do the same thing. Subversion docs are clear that it's a snarky name and that 'svn praise' and 'svn annotate' are neutral synonyms.
Perhaps someone familiar with CVS can comment on its history there since it seems to be the first source control to add it.
EDIT: and one of the main reasons it's a useful feature is it tells you who to talk to to understand a piece of code, or to coordinate a roll back, or to do any other sort of communication. It probably matters more in a big company where code is changing frequently and you're unlikely to know everyone and what they're working on.
> Subversion docs are clear that it's a snarky name and that 'svn praise' and 'svn annotate' are neutral synonyms.
A few years ago, some Atlassian developer changed "Blame" in the BitBucket UI to "Annotate". I remember a lot of people being frustrated because they couldn't find "blame" anywhere and the change was never officially announced. It just happened one day.
Someone opened a ticket with BitBucket about it which ended up drawing a lot of attention from frustrated users who couldn't find "blame", and their searches for it on Google led people to the ticket. Atlassian eventually responded saying that they made the change because "blame" sounds bad and can hurt people's feelings somehow (with no examples given, of course). Ironically, the dev who made the change certainly had hurt feelings after the upset masses had some choice words for the short-sighted decision. Atlassian doubled down and, I believe, closed the ticket without reverting the change, so the confusion remains, as far as I know.
I don't think that they ever mentioned the Subversion/CVS parallel that was drawn to choose that name, so it was really confusing why that was selected. But this comment shed some light on that ancient incident.
If I ever saw an "annotate" command I'd immediately assume it's for adding notes as metadata outside the actual software versioning tool, not for seeing who wrote the code in question.
Nomenclature matters. Do not reinvent terms just for fun.
> ironically the dev who made the change certainly had hurt feelings after the upset masses had some choice words for the short-sighted decision.
Dev probably became the public face for a decision made by someone else (e.g. product owner, TL, whatever the business structure is at Atlassian).
> EDIT: and one of the main reasons it's a useful feature is it tells you who to talk to to understand a piece of code, or to coordinate a roll back, or to do any other sort of communication. It probably matters more in a big company where code is changing frequently and you're unlikely to know everyone and what they're working on.
It's actually a pretty awful feature because it misses so much context. I've been blamed before for changes which were technically my fault, but while my code was to spec, some unrelated part of the code I was interacting with was not (iirc it was some multi-threaded nonsense like a race condition or something).
It was a super-stressful week of constantly having to defend my design decisions and white-boarding my thought process (think of the "am I taking crazy pills?!?" scene in Zoolander) as my senior coworker tried to gaslight and throw me under the bus.
Maybe I've had a uniquely bad experience with it, but I've vowed to never use it (as a way to attribute `blame`). Code should be holistically understood and it's your job as a technical leader to know how the parts move, resolve issues without drama, and make sure your whole team is on the same page: this is a cohesive team, not an adversarial dick-measuring contest.
This is why blameless post-mortems and a healthy culture are important. I'm sorry you encountered such a hostile culture. Perhaps "git blame" has the wrong name, but the idea of traceability is still important.
Aviation is the shining example of this, combining high traceability (you should be able to track each part back to the factory and to all the technicians who have worked on it) with accident inquiries that are focused around finding cause and avoiding future risks rather than assigning blame.
That sounds like an awful workplace culture. I doubt the name of the git command is responsible, though.
Just know that it can also be used in very positive ways.
For instance, I may want to change some basic behavior. Easy enough, spend some time implementing and testing, and then run into a downstream consequence of that change while implementing. Now I need to make a decision. Reviewing the history of the relevant sections of code, using git blame, can help me uncover the context and ways in which the code I'm puzzling about changing has changed previously. This can be incredibly valuable and speed up or even obviate an amount of discussion around the potential change.
I acknowledge that I have no idea what happened in this situation. Please don't take this as me justifying mean behavior.
> Code should be holistically understood and it's your job as a technical leader to know how the parts move
That's true when we design something. Once the design is done, and is broken, we have to tear it back apart to understand WHAT is broken. That's when blame is useful.
I love using git blame. I love it even more when it comes back with something I made, because then I get to learn. When something I thought was safe turned out to break something, that's an invaluable chance to understand the system better.
That being said, I've totally used the blame output to end a series of excuses from a junior about how "his code was definitely right, but everything else was garbage" because I really do not care. If it worked before, but doesn't work now, that's a problem. Part of the modern process of "fail fast" is also to build up taste about which parts of a working system are spooky.
I find that some people take the "blameless" culture too far, and use it as an excuse not to reflect on outcomes. They just ruffle the code whenever there's a bug, and don't think critically about why that bug appeared or what that tells us about the system we're making.
I work at a company with a mess of a monorepo but the git history is a gold mine. It's fascinating digging back into history and reading why certain decisions are made, or random pitfalls the author discovered, or context that was missing. It absolutely feels like a bit of a detective mystery trying to dig back and figure out if some line of code is a bug that was meant to do something else, or is functioning as intended and the requirements changed, or something else entirely.
Ofc as my org has gotten bigger, we've lost a lot of the discipline around writing good commit messages so now it's just a mess of large code-changes with 1-line "bugfix" explanations :(
> It's actually a pretty awful feature
Obligatory T-Shirt link
https://www.amazon.com/Blame-Ruining-Friendships-Since-T-Shi...
Isn't this sort of an inconsequential point? The commit still has one and only one author and that's almost certainly what I'm looking for so I know who to go ask questions about their code. I also use it to find the commit but less frequently.
At my last workplace, the codebase was about 25 years old, there were three of us, and one of us was the original author. You could simply guess "Gerald wrote this" and you'd be right nine times out of ten. However, it turns out that software developers have finite memory themselves, and svn blame was useful in tracing a line of code back to the original ticket.
Linking a line of code back to the commit is useful even if you can't ask the author about it. It tells you what other lines of code are involved and what the overall purpose is. It's significantly more useful if you can link it into documentation outside the code: ticketing systems, requirements docs, etc.
The main limit to svn blame in that situation was that quite often it would hit commit 1, when the codebase had been imported from Visual SourceSafe.
No, if your commits are meaningful and have plenty of context, like they should, then you are not looking for the author. Instead, you are looking for "why is this here", and the commit should tell you.
No, commits can be co-authored.
And the committer and author don’t even need to be the same!
But the point, as I read it, is: what matters is the context, i.e. if a line is faulty, what did things look like when it wasn’t faulty? The commit’s content is usually more important than the committer, although the committer is useful because you can ask them if they’re still around.
On a small team I usually already know who wrote the code I'm reading, but it's nice to see if a block of code is all from the same point in time, or if some of the lines are the result of later bugfixing. It's also useful to find the associated pull request for a block of code, to see what issues were considered in code review, to know whether something that seems odd was discussed or glossed over when the code was merged in the first place.
I find the GitHub blame view indispensable for this kind of code archeology, as they also give you an easy way to traverse the history of lines of code. In blame, you can go back to the previous revision that changed the line and see the blame for that, and on and on.
I really want to find or build a tool that can automatically traverse history this way, like git-evolve-log.
I've been carrying around a copy of "git blameall" for years - looks like https://github.com/gnddev/git-blameall is the same one - that basically does this, but keeps it all interleaved in one output (which works pretty well for archeology, especially if you're looking at "work hardened" code.)
(Work hardening is a metalworking term where metal bent back and forth (or hammered) too much becomes brittle; an analogous effect shows up in code, where a piece of code that has been bugfixed a couple of times will probably need more fixes; there was a published result a decade or so back about using this to focus QA efforts...)
there is https://github.com/emacsmirror/git-timemachine which is really nice if you use emacs.
I mean, no. If you work on a codebase that's been going for more than a few years, the author likely doesn't even work there anymore. The commit is the important thing.
Frankly the commit message is usually the important thing. I care about why a change happened. Give me a Jira ticket, or a line of reasoning, or some documentation. I need to know this far more often than I care who literally typed the code in the computer.
git bisect is about which commit is to blame for a reproducible problem.
git blame is about which author most recently touched each line (in what commit); i.e. is to "blame" for that line having its current content.
You're right in that git blame is most useful for finding which commit touched a line. What was done in the commit is more important than who did it.
git blame is very useful even in a solo project where you already know that you wrote every commit.
Unless you get lazy like me and start committing only out of shame once the modified file count reaches close to triple digits or prior to doing very sketchy changes.
BTW, one of the more frustrating things about "git blame" comes about when cleaning up an old codebase: In my current job I had to move a lot of files, combine repos, reformat code, etc., etc.
"git blame" and similar tools often show my name, even though I didn't write the code.
Most places I worked have a blame.ignoreRevsFile[0] somewhere on the top level to inhibit this. It's a bit awkward because first you need to commit, then you need to commit again to update the commit hash in the ignore revs file, but it's great for filtering out pure refactor churn.
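A scratch-repo sketch of that workflow. The file name `.git-blame-ignore-revs` is only a common convention (any path works), and the authors and contents are invented:

```sh
repo=$(mktemp -d) && cd "$repo" && git init -q .

printf 'a=1\nb=2\n' > conf.txt && git add conf.txt
git -c user.name=Alice -c user.email=alice@example.com commit -q -m 'add config'

printf 'a = 1\nb = 2\n' > conf.txt
git -c user.name=Bob -c user.email=bob@example.com commit -q -am 'reformat'

git blame conf.txt    # both lines point at the reformat commit

# One full commit hash per line; '#' comments are allowed.
git rev-parse HEAD > .git-blame-ignore-revs
git config blame.ignoreRevsFile .git-blame-ignore-revs

git blame conf.txt    # should now fall through to the original commit
```

For a one-off, `git blame --ignore-revs-file <file>` or `--ignore-rev <commit>` does the same without touching config. Lines that git cannot match to anything in the parent stay attributed to the ignored commit.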
[0] https://git-scm.com/docs/git-blame#Documentation/git-blame.t...
I have seen `git blame` used to blame specific people. I've seen it work. Some of those people deserved some blame.
The manpage explains what the command does. How and why it's used is up to the user.
Alternatively git praise https://github.com/ansman/git-praise
Why wouldn’t you just set an alias in your global git config for this?
You can't put an alias on your CV.
Now _that_ is one of the best two-liners I've ever seen!
Ideally, you find the context for why a change was made
Many people enjoy pretending that the term "blame" in "git blame" is not a funny little programmer joke that we don't have to be upset about.
It's funny because it always ends up telling me I did it.
One of the most delightful outcomes of having the silly name!
True.
No. You are thinking of git bisect
Bisect shows which commit is responsible for a yes/no behavior change. Blame shows which commit is responsible for a line of code. Both are useful for finding the responsible commit but for different things.
No, bisect is not blame but for commits. Blame shows you which COMMIT is to blame. That's my point.
Isn't that what git bisect is for?
Sometimes you don't know which commit actually caused the problem.
e.g., you realize that something broke A/B test logic on Friday. Sure, there are Jira tickets, but that's slow and annoying to dig through. There are commit messages, but things get squashed, etc. Plus, if you work in a monorepo with about 60 PRs a day, it's hard to know if it was your code or an associated library someone touched.
That's exactly when git bisect helps. It quickly narrows down which commit introduced the bug when you don't know where to start looking. Once bisect identifies the problematic commit, you can then use git blame (if needed) to see who made those specific changes.
Edit: Cleaned up what I was saying to hopefully avoid confusion.
I'm not quite sure what you're saying, but blame just tells you the last person and commit to change a line.
If you want to know which commit actually caused a problem you would use bisect. That may be what you're saying, but it sounded a bit like you are saying blame is better for tracking the culprit commit.
Bisect is for finding behavior changes in O(log n) operations. Blame is for finding the last change to a line in one operation.
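To make the O(log n) part concrete, here is a toy version of the search. It assumes a linear history ordered oldest to newest, a known-good first commit, a known-bad last commit, and a bug that stays once introduced. Real `git bisect` works on the commit graph, so treat this only as an illustration of the idea; `first_bad` and `is_bad` are names invented for the example.

```python
from typing import Callable, Sequence, Tuple


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> Tuple[str, int]:
    """Return (first bad commit, number of commits tested).

    Assumes commits[0] is good, commits[-1] is bad, and that every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    tested = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tested += 1
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi], tested
```

With about a thousand commits between the good and bad ends, this tests roughly ten of them, which is why bisect stays practical even in a busy monorepo.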
> You can invoke git-who as git who by setting up an alias in your global Git config
This works even without the alias, by the way: by default `git whatever` will search your path for `git-whatever` and execute it.
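If you want to check what `git whatever` would end up running, the lookup can be approximated like this. It is only a sketch: as far as I understand, git tries its built-in commands and its own exec path before falling back to `$PATH`, and this helper (a hypothetical name) only models the `$PATH` part.

```python
import shutil
from typing import Optional


def find_external_subcommand(name: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable named git-<name>, or None.

    path is a PATH-style string; None means use the current $PATH.
    """
    return shutil.which(f"git-{name}", path=path)
```

For example, `find_external_subcommand("who")` should return the installed `git-who` binary if it is on your path, and `None` otherwise.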
Wow! I had no idea. Will need to update the README. Thanks for the tip!
Has this behavior been the source of exploits in the past? Something about it feels dangerously presumptuous to me.
I am guessing it only resorts to that expansion if it doesn't _already_ know about the command, because $(printf '#!/bin/sh\necho pwned\n' > /bin/git-status; chmod 755 /bin/git-status; git status) results in the thing happening that you'd expect, not a mysterious message
FWIW, both brew and kubectl also have adopted this behavior (of $(basename)-plugin style verb extensions) so I find it unlikely they'd all do it if it was a straight-up facepalm
probably adding a confirmation message the first time the alias is used for each command would be good, it would be nice to know when i'm invoking git and when i'm invoking a third party binary regardless of any exploit attempts!
If malicious code ends up in your $PATH you have much bigger problems than git having a seamless plugin architecture.
For anyone not aware of it: `tig` is a really cool TUI git frontend, and it has a beautiful `tig blame` subcommand.
This is great. I do this sort of git-blame accounting to track how much code is written by AI versus humans in each release of my app.
My "blame script" has been slowing down as the repo size increases. I was just about to add caching, like you have.
Have you thought about adding the ability to limit the stats based on a set of file patterns? Perhaps like this, where the file follows gitignore conventions?
I tried to quickly add this functionality but unfortunately I don't know go.
Git natively supports excludes in all pathspecs, e.g. `git log -- ':!generated/'` to exclude files in the `generated/` folder from showing up in the log.
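For a longer list of excludes, a script can build the pathspecs instead of typing them by hand. This is a minimal sketch: `log_args_excluding` is a hypothetical helper, and it passes `.` as the default include so the excludes have something to subtract from.

```python
from typing import Iterable, List


def log_args_excluding(excludes: Iterable[str], includes: Iterable[str] = (".",)) -> List[str]:
    """Build a `git log` argument list that skips the given paths.

    Each exclude becomes a ':!<pattern>' pathspec, the short form of
    ':(exclude)<pattern>'.
    """
    args = ["git", "log", "--oneline", "--"]
    args.extend(includes)
    args.extend(f":!{pattern}" for pattern in excludes)
    return args
```

Passing the list straight to `subprocess.run(args)` avoids having to quote the `!` for the shell.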
That's a neat idea, I can see how it'd be useful.
If you have a shell that supports extended globbing, you could do something like:
That works for me using Bash. I believe all that's happening here is that Bash is expanding the globs and passing a long list of individual filepaths as arguments to git who. Git who then passes them to git log so that it only tallies the commits you'd get by running:
Yup. It’s a complex enough set of in/excludes that I think that would get unwieldy for my use case.
Details here:
https://github.com/Aider-AI/aider/blob/main/scripts/blame.py
Again, nice work on your tool. I’ll spend some more time trying to harness it for my need.
Thank you very much!
> $ git who table */**/*.go
I might have my globbing syntax wrong, but I think that `*/**/*.go` is the same as `**/*.go` unless you have `*.go` files in the working directory.
I don't think the comparison to git blame is needed/warranted. While the 'blame' in git blame suggests the tool is about identifying authors its main purpose is to identify commits so that you can find out the context of why something was changed. Instead this tool seems to be a fancy `git shortlog -sn`.
I like it. A problem I had right away is some people commit using two different emails. Like one from home computer and one from work computer. Would be nice to be able to define them as the same thing.
You might be able to do that with built-in git functionality called gitmailmap. It is basically a file where you can map multiple names and emails to the same one.
It's great when you read further down the comments and come across a gem like this. I had no idea this was possible; thanks.
Like other commenters have said, mailmap does this and git who will respect your mailmap file.
Like the other comment, this is "I came looking for copper but found gold" moment for me. Thanks!
That's what the mailmap is for.
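For anyone curious what the mapping amounts to, here is a simplified sketch of a mailmap lookup. It is not how git implements it: it only keys on the commit email, so the four-field form that also matches on commit name is treated as an email-only match. `parse_mailmap` and `canonical` are names made up for the example.

```python
import re
from typing import Dict, Tuple

_EMAIL = re.compile(r"<([^>]*)>")


def parse_mailmap(text: str) -> Dict[str, Tuple[str, str]]:
    """Map lowercase commit email -> (canonical name, canonical email).

    Handles 'Name <commit@email>', '<proper@email> <commit@email>' and
    'Name <proper@email> <commit@email>'. An empty name means "keep the
    name from the commit".
    """
    mapping: Dict[str, Tuple[str, str]] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        emails = _EMAIL.findall(line)
        if not emails:
            continue
        name = line.split("<", 1)[0].strip()
        # First email is the canonical one, last email is the one in the commit.
        mapping[emails[-1].lower()] = (name, emails[0])
    return mapping


def canonical(name: str, email: str, mapping: Dict[str, Tuple[str, str]]) -> Tuple[str, str]:
    """Return the (name, email) an author should be tallied under."""
    proper_name, proper_email = mapping.get(email.lower(), ("", email))
    return (proper_name or name, proper_email)
```

So a `.mailmap` line such as `Alice Smith <alice@work.example> <alice@home.example>` folds commits from the home address into the work identity.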
"This requires that you have Go, Ruby, and the rake Ruby gem installed." - sticking to the binaries then :) Cool little project, will try it tomorrow!
Things I'm missing in git are not how many lines or commits a given developer did, which might lead in a poorly managed organisation to strangely calculated KPIs, but rather:
Yes, exactly.
That is what I use git for each and every day.
> - who is owner of this method (some guy refactored it or reformatted, but who is the REAL owner, or what was the history of this method)
It doesn't work perfectly, but with magit you can jump to the revision before the refactor/reformat, then do another blame from there. I chased a line of code through several layers of refactors that way before and while the original author was long gone it did help explain why things were initially done that way.
I heavily depend on git-blame to understand code. It's one reason why I generally dislike "cleanup" changes that just change formatting/naming for the sake of it.
I like this one the best: https://github.com/jayphelps/git-blame-someone-else
I look forward to the inevitable upgraded version, “git whom”.
wow that is actually hilarious! XD
I would love to see this get a brew release.
it's ready!
Don't tell the higher ups about this stack ranking tool.
This is fantastic! I love it. Pretty quick too...
For a rails codebase that is ~18 years old, has 1695 committers and more than 220,000 commits:
This is neat.
I made a similar powershell script recently but reverse search from filename to find the authors by commits.
Thank you! My consumer-scale git blaming was leaving me with a woeful feeling of inadequacy.
This looks like an almost pure Golang program, but still has a Ruby dependency. Is there a component/library that Ruby provides, that Go doesn't have? Or some other logic going on?
I’m pretty sure it’s just using Ruby for rake, which is a task runner. So Ruby is only needed for the build process.
Cool stuff. I like using Git via the CLI, but when it comes to blame, I simply use the preview UI of the VSCode GitLens extension. It takes half a second to launch it from the command palette and inspect the blame.
I have some VS Code extension (errr... not sure which) that faintly inlines the git blame result on each line of code you're working on. I find it kind of handy.
That was actually recently added as a built-in feature
https://code.visualstudio.com/updates/v1_97#_git-blame-infor...
Like the sibling comment, I didn't want to run all of GitLens just for it, but now that it's a built-in I've also been finding it quite useful.
GitLens https://marketplace.visualstudio.com/items?itemName=eamodio....
I uninstalled it, I seem to recall it impacting the speed of VS Code a good bit.
I love TUI tools but I'm not too familiar with Golang, now I am thinking I should start looking into using go for TUIs
this is a great tool!
At pico.sh we have been experimenting with TUIs and remote clis successfully for a few years, you can see how we build our ssh tui app here: https://github.com/picosh/pico/tree/main/pkg/tui
I've been using `git summary` from tj/git-extras for a while. It seems to do a similar job.
Nice one, works better than mine;
I've been using a git alias for quite some time
`lead = shortlog -s -n --all --no-merges`
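If you want to post-process that alias's output in a script, the format is easy to parse: each line is a right-aligned commit count, a tab, then the author name. A small sketch (`parse_shortlog` is a hypothetical helper):

```python
from typing import List, Tuple


def parse_shortlog(output: str) -> List[Tuple[int, str]]:
    """Turn `git shortlog -s -n` output into (commit count, author) pairs."""
    rows: List[Tuple[int, str]] = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Each line looks like '   120<TAB>Author Name'.
        count, _, name = line.strip().partition("\t")
        rows.append((int(count), name))
    return rows
```

The input would come from something like `subprocess.run(["git", "shortlog", "-s", "-n", "--all", "--no-merges"], capture_output=True, text=True).stdout`.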
Very cool, I'm working on something similar as part of a bigger project (not TUI related). I'm interested in how you did blame caching, will take a look at the implementation. I am trying to do a "forward blame" so that the blame of new commits can be created very quickly. Happy to exchange some thoughts around this!
Cool, but how do I increase the number of rows? Is it always just the top ten?
The -n flag does this. Use -n 0 to show all rows
I love CLI tools, plus it is open source!
https://shortlog.io/
Cool stuff!
Tiny prototype implementation:
Run on log from GNU Bison. We anonymize names so that search engines don't index this comment to those names:
Code:
Followed the link, and the README said:
> This requires that you have Go, Ruby, and the rake Ruby gem installed.
That doesn't cut it for me. git - once built - depends on C libraries and Perl. If you want to add something onto git (that is not specifically targeting Go, or Ruby etc.) - it should not IMNSHO depend on other things.
That doesn't mean you can't write your tool in some modern fashionable language, but eventually you need to bring it down to earth (or rather earth + Perl).
These are all build dependencies. You don't need any of these just to run git who. The language could be clearer; I'll update it.
Ah, ok, great! I'll try it then...