Running Gource with Emacs

Running Gource with Emacs

Gource is a program that draws an animated graph of users changing the repository over time.

Although it can work without extra effort (just run gource in a git repo), there are some tweaks that can be done:

  • Gource supports using custom pictures for users. Gravatar is an obvious place to get these.
  • Occasionally, the same people have different names and/or emails in history.
    It may happen when people use forges like GitLab or just have different settings on different machines. It would be nice to merge these names.
  • Visualizing the history of multiple repositories (e.g. frontend and backend) requires combining multiple gource logs.

So, why not try doing that with Emacs?

Gravatars

Much to my surprise, Emacs turned out to have a built-in package called gravatar.el.

So, let’s make a function to retrieve a gravatar and save it:

(defun my/gravatar-retrieve-sync (email file-name)
  "Get gravatar for EMAIL and save it to FILE-NAME."
  (let ((gravatar-default-image "identicon")
        (gravatar-size nil)
        (coding-system-for-write 'binary)
        (write-region-annotate-functions nil)
        (write-region-post-annotation-function nil))
    (write-region
     (image-property (gravatar-retrieve-synchronously email) :data)
     nil file-name nil :silent)))

To use these images, we need to save them to some folder and use usernames as file names. The folder:

(setq my/gravatar-folder "/home/pavel/.cache/gravatars/")

And the function that downloads a gravatar if necessary:

(defun my/gravatar-save (email author)
  "Download gravatar for EMAIL.

AUTHOR is the username."
  (let ((file-name (concat my/gravatar-folder author ".png")))
    (mkdir my/gravatar-folder t)
    (unless (file-exists-p file-name)
      (message "Fetching gravatar for %s (%s)" author email)
      (my/gravatar-retrieve-sync email file-name))))

Merging authors

Now to merging authors.

Gource itself uses only usernames (without emails), but we can use git log to get both. The required information can be extracted like that:

git log --pretty=format:"%ae|%an" | sort | uniq -c | sed "s/^[ \t]*//;s/ /|/"

The output is a list of pipe-separated strings, where the values are:

  • Number of occurrences for this combination of username and email
  • Email
  • Username

Of course, that part would have to be changed appropriately for other version control systems if you happen to use one.

So, below is one hell of a function that wraps this command and tries to merge emails and usernames belonging to one author:

(defun my/git-get-authors (repo &optional authors-init)
  "Extract and merge all combinations of authors & emails from REPO.

REPO is the path to a git repository.

AUTHORS-INIT is the previous output of `my/git-get-authors'.  It can
be used to extract that information from multiple repositories.

The output is a list of alists with following keys:
- emails: list of (<email> . <count>)
- authors: list of (<username> . <count>)
- email: the most popular email
- author: the most popular username
I.e. one alist is all emails and usernames of one author."
  (let* ((default-directory repo)
         (data (shell-command-to-string
                "git log --pretty=format:\"%ae|%an\" | sort | uniq -c | sed \"s/^[ \t]*//;s/ /|/\""))
         (authors
          (cl-loop for string in (split-string data "\n")
                   if (= (length (split-string string "|")) 3)
                   collect (let ((datum (split-string string "|")))
                             `((count . ,(string-to-number (nth 0 datum)))
                               (email . ,(downcase (nth 1 datum)))
                               (author . ,(nth 2 datum)))))))
    (mapcar
     (lambda (datum)
       (setf (alist-get 'author datum)
             (car (cl-reduce
                   (lambda (acc author)
                     (if (> (cdr author) (cdr acc))
                         author
                       acc))
                   (alist-get 'authors datum)
                   :initial-value '(nil . -1))))
       (setf (alist-get 'email datum)
             (car (cl-reduce
                   (lambda (acc email)
                     (if (> (cdr email) (cdr acc))
                         email
                       acc))
                   (alist-get 'emails datum)
                   :initial-value '(nil . -1))))
       datum)
     (cl-reduce
      (lambda (acc val)
        (let* ((author (alist-get 'author val))
               (email (alist-get 'email val))
               (count (alist-get 'count val))
               (saved-value
                (seq-find
                 (lambda (cand)
                   (or (alist-get email (alist-get 'emails cand)
                                  nil nil #'string-equal)
                       (alist-get author (alist-get 'authors cand)
                                  nil nil #'string-equal)
                       (alist-get email (alist-get 'authors cand)
                                  nil nil #'string-equal)
                       (alist-get author (alist-get 'emails cand)
                                  nil nil #'string-equal)))
                 acc)))
          (if saved-value
              (progn
                (if (alist-get email (alist-get 'emails saved-value)
                               nil nil #'string-equal)
                    (cl-incf (alist-get email (alist-get 'emails saved-value)
                                        nil nil #'string-equal)
                             count)
                  (push (cons email count) (alist-get 'emails saved-value)))
                (if (alist-get author (alist-get 'authors saved-value)
                               nil nil #'string-equal)
                    (cl-incf (alist-get author (alist-get 'authors saved-value)
                                        nil nil #'string-equal)
                             count)
                  (push (cons author count) (alist-get 'authors saved-value))))
            (setq saved-value
                  (push `((emails . ((,email . ,count)))
                          (authors . ((,author . ,count))))
                        acc)))
          acc))
      authors
      :initial-value authors-init))))

Despite the probable we-enjoy-typing-ness of the implementation, it’s actually pretty simple:

  • The output of git log is parsed into a list of alists with count, email and author as keys.
  • This list is reduced by cl-reduce into a list of alists with emails and authors as keys and the respective counts as values, e.g. ((<email-1> . 1) (<email-2> . 3)).
    I’ve seen a couple of cases where people would swap their username and email (lol), so seq-find also looks for an email in the list of authors and vice versa.
  • The mapcar call determines the most popular email and username for each authors.

The output is another list of alists, now with the following keys:

  • emails - list of elements like (<email> . <count>)
  • authors - list of elements like (<author-name> . <count>)
  • email - the most popular email
  • author - the most popular username.

Running for multiple repos

This section was mostly informed by this page in the gource wiki.

As I said above, by default gource just creates a visualization for the current repo. To change something in it, we need to invoke the program like that: gource --output-custom-log PATH, where PATH is either the path to the log file or - for stdout.

The log consists of lines of pipe-separated strings, e.g.:

1600769568|dsofronov|A|/studentor/.dockerignore
1600769568|dsofronov|A|/studentor/.editorconfig
1600769568|dsofronov|A|/studentor/.flake8
1600769568|dsofronov|A|/studentor/.gitignore

where the values of one line are:

  • UNIX timestamp
  • Author name
  • A for add, M for modify, and D for delete
  • Path to file

The file has to be sorted by the timestamp in ascending order.

So, the function that prepares the log for one repository:

(defun my/gource-prepare-log (repo authors)
  "Create gource log string for REPO.

AUTHORS is the output of `my/git-get-authors'."
  (let ((log (shell-command-to-string
              (concat
               "gource --output-custom-log - "
               repo)))
        (authors-mapping (make-hash-table :test #'equal))
        (prefix (file-name-base repo)))
    (cl-loop for author-datum in authors
             for author = (alist-get 'author author-datum)
             do (my/gravatar-save (alist-get 'email author-datum) author)
             do (cl-loop for other-author in (alist-get 'authors author-datum)
                         unless (string-equal (car other-author) author)
                         do (puthash (car other-author) author
                                     authors-mapping)))
    (cl-loop for line in (split-string log "\n")
             concat (let ((fragments (split-string line "|")))
                      (when (> (length fragments) 3)
                        (when-let (mapped-author (gethash (nth 1 fragments)
                                                          authors-mapping))
                          (setf (nth 1 fragments) mapped-author))
                        (setf (nth 3 fragments)
                              (concat "/" prefix (nth 3 fragments))))
                      (string-join fragments "|"))
             concat "\n")))

This function:

  • Downloads a gravatar for each author
  • Replaces all usernames of one author with the most frequent one
  • Prepends the file path with the repository name.

The output is a string in the gource log format as described above.

Finally, as we need to invoke all of this for multiple repositories, why not do that with dired:

(defun my/gource-dired-create-logs (repos log-name)
  "Create combined gource log for REPOS.

REPOS is a list of strings, where a string is a path to a git repo.
LOG-NAME is the path to the resulting log file.

This function is meant to be invoked from `dired', where the required
repositories are marked."
  (interactive (list (or (dired-get-marked-files nil nil #'file-directory-p)
                         (user-error "Select at least one directory"))
                     (read-file-name "Log file name: " nil "combined.log")))
  (let ((authors
         (cl-reduce
          (lambda (acc repo)
            (my/git-get-authors repo acc))
          repos
          :initial-value nil)))
    (with-temp-file log-name
      (insert
       (string-join
        (seq-filter
         (lambda (line)
           (not (string-empty-p line)))
         (seq-sort-by
          (lambda (line)
            (if-let (time (car (split-string line "|")))
                (string-to-number time)
              0))
          #'<
          (split-string
           (mapconcat
            (lambda (repo)
              (my/gource-prepare-log repo authors))
            repos "\n")
           "\n")))
        "\n")))))

This function extracts authors from each repository and merges the logs as required by gource, that is sorting the result by time in ascending order.

Using the function

To use the function above, mark the required repos in a dired buffer and run M-x my/gource-dired-create-logs. This also works nicely with dired-subtree, in case your repos are located in different folders.

The function will create a combined log file (by default combined.log). To visualize the log, run:

gource <log-file> --user-image-dir <path-to-gravatars>

Check the README for possible parameters, such as the speed of visualization, different elements, etc. That’s it!

I thought about making something like a transient.el wrapper around the gource command but figured it wasn’t worth the effort for something that I run just a handful of times in a year.