
Quansight Labs Work Update for September, 2019

This post has been cross-posted on the Quansight Labs Blog.

As of November, 2018, I have been working at Quansight. Quansight is a new startup founded by the same people who started Anaconda, which aims to connect companies and open source communities, and offers consulting, training, support and mentoring services. I work under the heading of Quansight Labs. Quansight Labs is a public-benefit division of Quansight. It provides a home for a "PyData Core Team" which consists of developers, community managers, designers, and documentation writers who build open-source technology and grow open-source communities around all aspects of the AI and Data Science workflow.

My work at Quansight is split between doing open source consulting for various companies, and working on SymPy. SymPy, for those who do not know, is a symbolic mathematics library written in pure Python. I am the lead maintainer of SymPy.

In this post, I will detail some of the open source work that I have done recently, both as part of my open source consulting, and as part of my work on SymPy for Quansight Labs.

Bounds Checking in Numba

As part of work on a client project, I have been contributing code to the numba project. Numba is a just-in-time compiler for Python. It lets you write ordinary Python code and, with a simple @jit decorator, have it automatically compiled and sped up using LLVM. This can result in code that is up to 1000x faster in some cases:


In [1]: import numba

In [2]: import numpy

In [3]: def test(x):
   ...:     A = 0
   ...:     for i in range(len(x)):
   ...:         A += i*x[i]
   ...:     return A
   ...:

In [4]: @numba.njit
   ...: def test_jit(x):
   ...:     A = 0
   ...:     for i in range(len(x)):
   ...:         A += i*x[i]
   ...:     return A
   ...:

In [5]: x = numpy.arange(1000)

In [6]: %timeit test(x)
249 µs ± 5.77 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [7]: %timeit test_jit(x)
336 ns ± 0.638 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [8]: 249/.336
Out[8]: 741.0714285714286

Numba only works for a subset of Python code, and primarily targets code that uses NumPy arrays.

Numba, with the help of LLVM, achieves this level of performance through many optimizations. One thing that it does to improve performance is to remove all bounds checking from array indexing. This means that if an array index is out of bounds, instead of receiving an IndexError, you will get garbage, or possibly a segmentation fault.

>>> import numpy as np
>>> from numba import njit
>>> def outtabounds(x):
...     A = 0
...     for i in range(1000):
...         A += x[i]
...     return A
>>> x = np.arange(100)
>>> outtabounds(x) # pure Python/NumPy behavior
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in outtabounds
IndexError: index 100 is out of bounds for axis 0 with size 100
>>> njit(outtabounds)(x) # the default numba behavior
-8557904790533229732

In numba pull request #4432, I am working on adding a flag to @njit that will enable bounds checks for array indexing. This will remain disabled by default for performance purposes, but you will be able to enable it by passing boundscheck=True to @njit, or by setting the NUMBA_BOUNDSCHECK=1 environment variable. This will make it easier to detect out of bounds issues like the one above. It will work like this:

>>> @njit(boundscheck=True)
... def outtabounds(x):
...     A = 0
...     for i in range(1000):
...         A += x[i]
...     return A
>>> x = np.arange(100)
>>> outtabounds(x) # numba behavior in my pull request #4432
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index is out of bounds

The pull request is still in progress, and many things such as the quality of the error message reporting will need to be improved. This should make debugging issues easier for people who write numba code once it is merged.
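To make concrete what a bounds check adds, here is a minimal pure-Python sketch. It is not numba's implementation, and the helper name checked_index is made up for illustration. It performs the comparison that the compiled code otherwise skips, and raises IndexError instead of reading past the end of the data.

```python
def checked_index(size, i):
    """Return a valid non-negative index into an axis of length `size`.

    Hypothetical helper for illustration only; it follows the usual
    Python/NumPy rule that valid indices satisfy -size <= i < size.
    """
    if i < -size or i >= size:
        raise IndexError(f"index {i} is out of bounds for axis 0 with size {size}")
    # Negative indices count from the end of the axis.
    return i + size if i < 0 else i


def outtabounds_checked(x):
    """Same loop as outtabounds above, but every access is checked first."""
    total = 0
    for i in range(1000):
        total += x[checked_index(len(x), i)]
    return total


if __name__ == "__main__":
    try:
        outtabounds_checked(list(range(100)))
    except IndexError as exc:
        print(exc)
```

The extra comparison on every access is the kind of cost that is avoided when bounds checking is off, which is why the flag is opt-in.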

removestar

removestar is a new tool I wrote to automatically replace import * in Python modules with explicit imports.

For those who don't know, Python's import statement supports so-called "wildcard" or "star" imports, like

from sympy import *

This will import every public name from the sympy module into the current namespace. This is often useful because it saves on typing every name that is used in the import line. This is especially useful when working interactively, where you just want to import every name and minimize typing.

However, doing from module import * is generally frowned upon in Python. It is considered acceptable when working interactively at a python prompt, or in __init__.py files (removestar skips __init__.py files by default).

Some reasons why import * is bad:

  • It hides which names are actually imported.
  • It is difficult both for human readers and static analyzers such as pyflakes to tell where a given name comes from when import * is used. For example, pyflakes cannot detect undefined names (for instance, from typos) in the presence of import *.
  • If there are multiple import * statements, it may not be clear which names come from which module. In some cases, both modules may have a given name, but only the second import will end up being used. This can break people's intuition that the order of imports in a Python file generally does not matter.
  • import * often imports more names than you would expect. Unless the module you import defines __all__ or carefully dels unused names at the module level, import * will import every public (doesn't start with an underscore) name defined in the module file. This can often include things like standard library imports or loop variables defined at the top-level of the file. For imports from modules (from __init__.py), from module import * will include every submodule defined in that module. Using __all__ in modules and __init__.py files is also good practice, as these things are also often confusing even for interactive use where import * is acceptable.
  • In Python 3, import * is syntactically not allowed inside of a function definition.
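Two of the points above (extra names leaking out, and later star imports silently winning) can be seen in a small self-contained sketch. The module names demo_no_all and demo_with_all are invented for this example and are created in memory so that no files are needed.

```python
import sys
import types


def make_module(name, source):
    """Create an in-memory module so the example needs no files on disk."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# No __all__: every public top-level name leaks out, including the
# standard library import and the loop variable.
make_module("demo_no_all", "import os\nx = 1\n_private = 0\nfor i in range(3):\n    pass\n")

# With __all__: only the listed names are exported.
make_module("demo_with_all", "import os\n__all__ = ['x']\nx = 2\ni = 0\n")


def star_import(*module_names):
    """Run `from <module> import *` for each module, in order, in a fresh namespace."""
    namespace = {}
    for name in module_names:
        exec(f"from {name} import *", namespace)
    namespace.pop("__builtins__", None)
    return namespace


print(sorted(star_import("demo_no_all")))                # ['i', 'os', 'x']
print(sorted(star_import("demo_with_all")))              # ['x']
print(star_import("demo_no_all", "demo_with_all")["x"])  # 2
print(star_import("demo_with_all", "demo_no_all")["x"])  # 1
```

The last two lines show that swapping the order of two star imports changes which x ends up in the namespace.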

Here are some official Python references stating not to use import * in files:

  • The official Python FAQ:

    In general, don’t use from modulename import *. Doing so clutters the importer’s namespace, and makes it much harder for linters to detect undefined names.

  • PEP 8 (the official Python style guide):

    Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.

Unfortunately, if you come across a file in the wild that uses import *, it can be hard to fix it, because you need to find every name in the file that is imported from the * and manually add an import for it. Removestar makes this easy by finding which names come from * imports and replacing the import lines in the file automatically.

As an example, suppose you have a module mymod like

mymod/
  | __init__.py
  | a.py
  | b.py

with

# mymod/a.py
from .b import *

def func(x):
    return x + y

and

# mymod/b.py
x = 1
y = 2

Then removestar works like:

$ removestar -i mymod/
$ cat mymod/a.py
# mymod/a.py
from .b import y

def func(x):
    return x + y

The -i flag causes it to edit a.py in-place. Without it, it would just print a diff to the terminal.

For implicit star imports and explicit star imports from the same module, removestar works statically, making use of pyflakes. This means none of the code is actually executed. For external imports, it is not possible to work statically as external imports may include C extension modules, so in that case, it imports the names dynamically.
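As a rough illustration of the static approach, the sketch below finds which names a file needs from a star import without executing any of the code. This is not how removestar is implemented (it uses pyflakes, as noted above). The sketch uses only the standard library ast module, ignores scoping rules, and all function names in it are made up for the example.

```python
import ast

A_SOURCE = """\
from .b import *

def func(x):
    return x + y
"""

B_SOURCE = """\
x = 1
y = 2
"""


def public_top_level_names(source):
    """Names a module would export via `import *` (assuming no __all__)."""
    names = set()
    for stmt in ast.parse(source).body:
        if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(stmt.name)
        elif isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                names.update(n.id for n in ast.walk(target) if isinstance(n, ast.Name))
    return {name for name in names if not name.startswith("_")}


def names_needed_from_star(source, star_module_names):
    """Names that are read but never bound in `source` and exist in the star module.

    Simplification: bindings are collected for the whole file, not per scope.
    """
    loaded, bound = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            (loaded if isinstance(node.ctx, ast.Load) else bound).add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
    return sorted((loaded - bound) & star_module_names)


needed = names_needed_from_star(A_SOURCE, public_top_level_names(B_SOURCE))
print("from .b import " + ", ".join(needed))  # from .b import y
```

Note that x is not imported even though b.py defines it, because in a.py the name x is a function parameter.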

removestar can be installed with pip or conda:

pip install removestar

or if you use conda

conda install -c conda-forge removestar

sphinx-math-dollar

In SymPy, we make heavy use of LaTeX math in our documentation. For example, in our special functions documentation, most special functions are defined using a LaTeX formula, like this:

[Image: The docs for besselj]

(from https://docs.sympy.org/dev/modules/functions/special.html#sympy.functions.special.bessel.besselj)

However, the source for this math in the docstring of the function uses RST syntax:

class besselj(BesselBase):
    """
    Bessel function of the first kind.

    The Bessel `J` function of order `\nu` is defined to be the function
    satisfying Bessel's differential equation

    .. math ::
        z^2 \frac{\mathrm{d}^2 w}{\mathrm{d}z^2}
        + z \frac{\mathrm{d}w}{\mathrm{d}z} + (z^2 - \nu^2) w = 0,

    with Laurent expansion

    .. math ::
        J_\nu(z) = z^\nu \left(\frac{1}{\Gamma(\nu + 1) 2^\nu} + O(z^2) \right),

    if :math:`\nu` is not a negative integer. If :math:`\nu=-n \in \mathbb{Z}_{<0}`
    *is* a negative integer, then the definition is

    .. math ::
        J_{-n}(z) = (-1)^n J_n(z).

Furthermore, in SymPy's documentation we have configured it so that text between `single backticks` is rendered as math. This was originally done for convenience, as the alternative is to write :math:`\nu` every time you want to use inline math. But this has led to many people being confused, as they are used to Markdown, where `single backticks` produce code.

A better way to write this would be if we could delimit math with dollar signs, like $\nu$. This is how things are done in LaTeX documents, as well as in things like the Jupyter notebook.

With the new sphinx-math-dollar Sphinx extension, this is now possible. Writing $\nu$ produces the rendered symbol ν, and the above docstring can now be written as

class besselj(BesselBase):
    """
    Bessel function of the first kind.

    The Bessel $J$ function of order $\nu$ is defined to be the function
    satisfying Bessel's differential equation

    .. math ::
        z^2 \frac{\mathrm{d}^2 w}{\mathrm{d}z^2}
        + z \frac{\mathrm{d}w}{\mathrm{d}z} + (z^2 - \nu^2) w = 0,

    with Laurent expansion

    .. math ::
        J_\nu(z) = z^\nu \left(\frac{1}{\Gamma(\nu + 1) 2^\nu} + O(z^2) \right),

    if $\nu$ is not a negative integer. If $\nu=-n \in \mathbb{Z}_{<0}$
    *is* a negative integer, then the definition is

    .. math ::
        J_{-n}(z) = (-1)^n J_n(z).

We also plan to add support for $$double dollars$$ for display math so that .. math :: is no longer needed either.

For end users, the documentation on docs.sympy.org will continue to render exactly the same, but for developers, it is much easier to read and write.
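The correspondence between the two notations can be sketched with a regular expression. This is only an illustration of the mapping from $...$ to the :math: role, not the extension's actual implementation, and the function name dollars_to_role is made up.

```python
import re

# An unescaped "$", the shortest possible run of text, then another unescaped "$".
DOLLAR_MATH = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$")


def dollars_to_role(text):
    """Rewrite inline $...$ math as the RST :math:`...` role (illustrative only)."""
    # A function is used as the replacement so that backslashes in the LaTeX
    # are copied through literally instead of being treated as escapes.
    return DOLLAR_MATH.sub(lambda m: ":math:`" + m.group(1) + "`", text)


print(dollars_to_role(r"The Bessel $J$ function of order $\nu$ is defined"))
# The Bessel :math:`J` function of order :math:`\nu` is defined
```

Dollar signs escaped with a backslash are left alone, so ordinary prices in prose would not be turned into math by this sketch.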

This extension can be easily used in any Sphinx project. Simply install it with pip or conda:

pip install sphinx-math-dollar

or

conda install -c conda-forge sphinx-math-dollar

Then enable it in your conf.py:

extensions = ['sphinx_math_dollar', 'sphinx.ext.mathjax']

Google Season of Docs

The above work on sphinx-math-dollar is part of work I have been doing to improve the tooling around SymPy's documentation. This has been to assist our technical writer Lauren Glattly, who is working with SymPy for the next three months as part of the new Google Season of Docs program. Lauren's project is to improve the consistency of our docstrings in SymPy. She has already identified many key ways our docstring documentation can be improved, and is currently working on a style guide for writing docstrings. Some of the issues that Lauren has identified require improved tooling around the way the HTML documentation is built to fix. So some other SymPy developers and I have been working on improving this, so that she can focus on the technical writing aspects of our documentation.

Lauren has created a draft style guide for documentation at https://github.com/sympy/sympy/wiki/SymPy-Documentation-Style-Guide. Please take a moment to look at it and if you have any feedback on it, comment below or write to the SymPy mailing list.

What's New in SymPy 1.4

This post has been cross-posted on the Quansight Blog.

As of November, 2018, I have been working at Quansight. Quansight is a new startup founded by the same people who started Anaconda, which aims to connect companies and open source communities, and offers consulting, training, support and mentoring services. I work under the heading of Quansight Labs. Quansight Labs is a public-benefit division of Quansight. It provides a home for a "PyData Core Team" which consists of developers, community managers, designers, and documentation writers who build open-source technology and grow open-source communities around all aspects of the AI and Data Science workflow. As a part of this, I am able to spend a fraction of my time working on SymPy. SymPy, for those who do not know, is a symbolic mathematics library written in pure Python. I am the lead maintainer of SymPy.

SymPy 1.4 was released on April 9, 2019. In this post, I'd like to go over some of the highlights for this release. The full release notes for the release can be found on the SymPy wiki.

To update to SymPy 1.4, use

conda install sympy

or if you prefer to use pip

pip install -U sympy

The SymPy 1.4 release contains over 500 changes from 38 different submodules, so I will not be going over every change, but only a few of the main highlights. A total of 104 people contributed to this release, of whom 66 contributed for the first time for this release.

While I did not personally work on any of the changes listed below (my work for this release tended to be more invisible, behind the scenes fixes), I did do the release itself.

Automatic LaTeX rendering in the Jupyter notebook

Prior to SymPy 1.4, SymPy expressions in the notebook rendered by default with their string representation. To get LaTeX output, you had to call init_printing():

SymPy 1.3 rendering in the Jupyter lab notebook

In SymPy 1.4, SymPy expressions now automatically render as LaTeX in the notebook:

SymPy 1.4 rendering in the Jupyter lab notebook

However, this only applies automatically if the type of an object is a SymPy expression. For built-in types such as lists or ints, init_printing() is still required to get LaTeX printing. For example, solve() returns a list, so does not render as LaTeX unless init_printing() is called:

SymPy 1.4 rendering in the Jupyter lab notebook with init_printing()

init_printing() is also still needed if you want to change any of the printing settings, for instance, passing flags to the latex() printer or selecting a different printer.

If you want the string form of an expression for copy-pasting, you can use print.
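The reason lists behave differently can be illustrated with the rich display hook that Jupyter looks for on an object, _repr_latex_. The sketch below uses a toy class named Half, invented for this example. It is not SymPy code and makes no claim about how SymPy implements the feature.

```python
class Half:
    """Toy object that can describe itself as LaTeX (illustrative only)."""

    def __repr__(self):
        return "1/2"

    def _repr_latex_(self):
        # Jupyter calls this hook, if present, to get a LaTeX representation.
        return r"$\displaystyle \frac{1}{2}$"


value = Half()
container = [value]

print(hasattr(value, "_repr_latex_"))      # True: the object itself can render as LaTeX
print(hasattr(container, "_repr_latex_"))  # False: a built-in list has no such hook
print(repr(container))                     # [1/2]
```

A built-in list falls back to its plain repr, which is why something extra, such as init_printing(), is needed before containers of expressions are shown as LaTeX.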

Improved simplification of relational expressions

Simplification of relational and piecewise expressions has been improved:

>>> x, y, z, w = symbols('x y z w')
>>> init_printing()
>>> expr = And(Eq(x,y), x >= y, w < y, y >= z, z < y)
>>> expr
x = y ∧ x ≥ y ∧ y ≥ z ∧ w < y ∧ z < y
>>> simplify(expr)
x = y ∧ y > Max(w, z)
>>> expr = Piecewise((x*y, And(x >= y, Eq(y, 0))), (x - 1, Eq(x, 1)), (0, True))
>>> expr
⎧ x⋅y   for y = 0 ∧ x ≥ y
⎪
⎨x - 1      for x = 1
⎪
⎩  0        otherwise
>>> simplify(expr)
0
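Both simplifications can be sanity-checked without SymPy by brute force over a small grid of integers. The sketch below restates the expressions as plain Python functions (the function names are made up) and compares the original and simplified forms. It is a spot check on sample points, not a proof.

```python
from itertools import product


def original(x, y, z, w):
    return x == y and x >= y and w < y and y >= z and z < y


def simplified(x, y, z, w):
    return x == y and y > max(w, z)


def piecewise(x, y):
    """The Piecewise expression above, evaluated branch by branch."""
    if y == 0 and x >= y:
        return x * y
    if x == 1:
        return x - 1
    return 0


samples = range(-3, 4)
relations_agree = all(
    original(*point) == simplified(*point) for point in product(samples, repeat=4)
)
piecewise_is_zero = all(piecewise(x, y) == 0 for x, y in product(samples, repeat=2))
print(relations_agree, piecewise_is_zero)  # True True
```

The Piecewise case collapses to 0 because the first branch only applies when y = 0, which makes x*y zero, and the second only when x = 1, which makes x - 1 zero.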

Improved MathML printing

The MathML presentation printer has been greatly improved, putting it on par with the existing Unicode and LaTeX pretty printers.

>>> mathml(Integral(exp(-x**2), (x, -oo, oo)), 'presentation')
<mrow><msubsup><mo>&#x222B;</mo><mrow><mo>-</mo><mi>&#x221E;</mi></mrow><mi>&#x221E;</mi></msubsup><msup><mi>&ExponentialE;</mi><mrow><mo>-</mo><msup><mi>x</mi><mn>2</mn></msup></mrow></msup><mo>&dd;</mo><mi>x</mi></mrow>

If your browser supports MathML (at the time of writing, only Firefox and Safari), the above presentation form for Integral(exp(-x**2), (x, -oo, oo)) renders as $\int_{-\infty}^{\infty} e^{-x^2}\, dx$.

Improvements to solvers

Several improvements have been made to the solvers.

>>> eq = Eq((x**2 - 7*x + 11)**(x**2 - 13*x + 42), 1)
>>> eq
                2
               x  - 13⋅x + 42
⎛ 2           ⎞
⎝x  - 7⋅x + 11⎠               = 1
>>> solve(eq, x) # In SymPy 1.3, this only gave the partial solution [2, 5, 6, 7]
[2, 3, 4, 5, 6, 7]
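The six solutions are easy to verify by direct substitution using plain Python integers. The sketch below also shows why 3 and 4 are the easy ones to miss: there the base is -1 and the exponent is even. It only searches integers in a small range, so it says nothing about non-integer solutions.

```python
def base(x):
    return x**2 - 7*x + 11


def exponent(x):
    # Equals (x - 6)*(x - 7), which is never negative for integer x.
    return x**2 - 13*x + 42


def lhs(x):
    return base(x) ** exponent(x)


integer_solutions = [x for x in range(-10, 20) if lhs(x) == 1]
print(integer_solutions)  # [2, 3, 4, 5, 6, 7]

for x in integer_solutions:
    print(x, base(x), exponent(x))
```

The solutions fall into three groups: base 1 (x = 2, 5), base -1 with an even exponent (x = 3, 4), and exponent 0 with a nonzero base (x = 6, 7).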

The ODE solver, dsolve, has also seen some improvements. Two new hints have been added.

'nth_algebraic' solves ODEs using solve by inverting the derivatives algebraically:

>>> f = Function('f')
>>> eq = Eq(f(x) * (f(x).diff(x)**2 - 1), 0)
>>> eq
⎛          2    ⎞
⎜⎛d       ⎞     ⎟
⎜⎜──(f(x))⎟  - 1⎟⋅f(x) = 0
⎝⎝dx      ⎠     ⎠
>>> dsolve(eq, f(x)) # In SymPy 1.3, this only gave the solution f(x) = C1 - x
[f(x) = 0, f(x) = C₁ - x, f(x) = C₁ + x]

'nth_order_reducible' solves ODEs that only involve derivatives of f(x), via the substitution $g(x)=f^{(n)}(x)$.

>>> eq = Eq(Derivative(f(x), (x, 2)) + x*Derivative(f(x), x), x)
>>> eq
               2
  d           d
x⋅──(f(x)) + ───(f(x)) = x
  dx           2
             dx
>>> dsolve(eq, f(x))
                  ⎛√2⋅x⎞
f(x) = C₁ + C₂⋅erf⎜────⎟ + x
                  ⎝ 2  ⎠
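For readers who want to see where the erf comes from, here is a hand derivation of the same result using the substitution described above with $n = 1$. It is a sketch of the mathematics, not a description of the steps dsolve takes internally, and the constant $C$ is introduced only for this derivation.

```latex
\begin{align*}
g(x) &= f'(x) \quad\Longrightarrow\quad g'(x) + x\,g(x) = x, \\
\frac{d}{dx}\left(e^{x^2/2}\, g(x)\right) &= x\, e^{x^2/2}
  \quad\Longrightarrow\quad g(x) = 1 + C\, e^{-x^2/2}, \\
f(x) &= \int g(x)\, dx
  = C_1 + x + C \sqrt{\tfrac{\pi}{2}}\,
    \operatorname{erf}\!\left(\tfrac{x}{\sqrt{2}}\right).
\end{align*}
```

Setting $C_2 = C\sqrt{\pi/2}$ and noting that $x/\sqrt{2} = \sqrt{2}\,x/2$ gives the form printed by dsolve.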

Dropping Python 3.4 support

This is the last release of SymPy to support Python 3.4. SymPy 1.4 supports Python 2.7, 3.4, 3.5, 3.6, 3.7, and PyPy. What's perhaps more exciting is that the next release of SymPy, 1.5, which will be released later this year, will be the last version to support Python 2.7.

Our policy is to drop support for major Python versions when they reach their End of Life. In other words, they receive no further support from the core Python team. Python 3.4 reached its end of life on March 18 of this year, and Python 2.7 will reach its end of life on January 1, 2020.

I have blogged in the past on why I believe it is important for library authors to be proactive in dropping Python 2 support, and since then a large number of Python libraries have either dropped support or announced their plans to by 2020.

Having Python 2 support removed will not only allow us to remove a large amount of compatibility cruft from our codebase, it will also allow us to use some Python 3-only features that will clean up our API, such as keyword-only arguments, type hints, and Unicode variable names. It will also enable several internal changes that will not be visible to end-users, but which will result in a much cleaner and more maintainable codebase.

If you are still using Python 2, I strongly recommend switching to Python 3, as otherwise the entire ecosystem of Python libraries is soon going to stop improving for you. Python 3 is already highly recommended for SymPy usage due to several key improvements. In particular, in Python 3, division of two Python ints like 1/2 produces the float 0.5. In Python 2, it does integer division (producing 1/2 == 0). The Python 2 integer division behavior can lead to very surprising results when using SymPy (imagine writing x**2 + 1/2*x + 2 and having the x term "disappear"). When using SymPy, we recommend using rational numbers (like Rational(1, 2)) and avoiding int/int, but the Python 3 behavior will at least maintain a mathematically correct result if you do not do this. SymPy is also already faster in Python 3 due to things like math.gcd and functools.lru_cache being written in C, and general performance improvements in the interpreter itself.
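The "disappearing" x term can be reproduced in Python 3 by using floor division (//), which is what / did for two ints in Python 2. The sketch below uses fractions.Fraction from the standard library as a stand-in for SymPy's Rational, and the function names are made up for the example.

```python
from fractions import Fraction


def poly_floor_division(x):
    # `1//2` is 0, mimicking Python 2's `1/2`, so the x term vanishes.
    return x**2 + 1//2*x + 2


def poly_true_division(x):
    # Python 3: `1/2` is the float 0.5, so the value is correct but inexact in general.
    return x**2 + 1/2*x + 2


def poly_exact(x):
    # An exact rational coefficient, analogous to Rational(1, 2) in SymPy.
    return x**2 + Fraction(1, 2)*x + 2


print(poly_floor_division(3))  # 11
print(poly_true_division(3))   # 12.5
print(poly_exact(3))           # 25/2
```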

And much more

These are only a few of the highlights of the hundreds of changes in this release. The full release notes can be found on our wiki. The wiki also has the in progress changes for our next release, SymPy 1.5, which will be released later this year. Our bot automatically collects release notes from every pull request, meaning SymPy releases have very comprehensive and readable release notes pages. If you see any mistakes on either page, feel free to edit the wiki and fix them.

GitHub Cuts

GitHub recently announced its paper cuts initiative to fix minor issues that make things more difficult for GitHub users. As someone who spends most of his day on github.com, this initiative is great, as these small cuts can quickly add up to a painful experience.

The initiative has already made some great fixes, such as making the diff markers unselectable and hovercards. Small changes like these are usually quite easy for GitHub to do, but they make a huge difference to those of us who use GitHub every day.

I recently asked how these cuts could be reported to GitHub for fixing, but got no response. So I am writing this blog post.

To be very clear: I think that on the whole GitHub is great and they are doing a great job. And it's still better than the alternatives (to put things in perspective, I recently spent half an hour trying to figure out how to change my password in BitBucket, and GitLab can't even keep me logged in between sessions). GitHub has and continues to revolutionize the open source ecosystem, and is still the best place to host an open source project.

But since GitHub did ask what sorts of changes users want to see, I'm providing a list. In this post I'm trying to only ask about things that are small changes (though I realize many won't be as easy to fix as they may appear from the outside, and I readily admit that I am not a web developer).

These are just the things that have bothered me, personally. Other people use GitHub differently and no doubt have their own pain points. For instance, I have no suggestions about the project boards feature of GitHub because I don't use it. If you are also a GitHub user and have your own pain points, feel free to use the comment box below (though I have no idea if GitHub will actually see them).

If you work for GitHub and have any questions, feel free to comment below, or email me.

In no particular order:

Issues

  1. Allow anyone to add labels to issues. At the very least, allow the person who opened the issue to add labels.

  2. The new issue transfer ability is great, but please make it require only push access, not admin access.

  3. Remove the automatic hiding of comments when there are too many. I understand this is done for technical reasons, but it breaks Cmd-F/scrolling through the page to find comments. Often I go to an issue trying to find an old comment and can't because buried in the comments is a button I have to press to actually show the comment (it's even worse when you have to find and press the button multiple times).

  4. Better indication for cross-referenced pull requests. I really don't know how to fix this, only that it is a problem. It happens all the time that a new contributor comes to a SymPy issue and asks if it has been worked on yet. They generally do not seem to notice the cross-referenced pull requests in the list. Here is an example of what I'm talking about.

  5. Indicate better if a cross-referenced pull request would close an issue. Preferably with text, not just an icon.

  6. HTML pull request/issue templates. I don't know if this counts as a "cut", as it isn't a simple fix. Right now, many projects use pull requests/new issue templates, but it is not very user friendly. The problem is that the whole thing is done in plain text, often with the template text as an HTML comment to prevent it from appearing in the final issue text. Even for me, I often find this quite difficult to read through, but for new contributors, we often find that they don't read it at all. Sure there's no way to force people to read, but if we could instead create a very simple HTML form for people to fill out, it would be much more friendly, even to experienced people like myself.

  7. Fix the back button in Chrome. I don't know if this is something that GitHub can fix, and I also do not know how things work in other browsers. I use Chrome on macOS. Often, when I click the "back" button and it takes me back to an issue page, the contents of the page are out-of-date (the newest comments or commits do not appear). It's often even more out-of-date than it was when I left the page. I have to reload the page to get the latest content.

  8. Allow Markdown formatting in issue titles.

  9. Show people's names next to comments as "Real Name (@username)". In general, GitHub should be emphasizing people's display names rather than their usernames.

  10. Remember my selection for the "sort" setting in the issues list. I'd love to have issues/pull requests sort by "most recently updated" by default, so that I don't miss updates to old issues/pull requests.

  11. Make advanced search filters more accessible. They should autofill, similar to how Gmail or even GitLab search works (yes, please steal all the good ideas from GitLab; they already stole all their good ideas from you).

  12. Tone down the reaction emojis. Maybe this ship has sailed, but reaction emojis are way too unprofessional for some projects.

  13. Copy/paste text as Markdown. For example, copying "bold" and pasting it into the comment box would paste **bold**. Another idea that you can steal from GitLab.

  14. Strike out #12345 issue links when the issue/PR is closed/merged (like ~~#12345~~).

Pull requests

  1. Add a button that automatically merges a pull request as soon as all the CI checks pass. Any additional commits pushed to the branch in the interim would cancel it, and it should also be cancellable by someone else with push access or the PR author.

  2. Add some way to disable the automatic hiding of large diffs. This breaks Cmd-F on the page, and makes it harder to scroll through the pull request to find what you are looking for (typically the most important changes are the ones that are hidden!).

  3. Include all issue/PR authors in the authors search on the pull request list page. Right now it only lists people with push access. But I generally can't remember people's GitHub usernames, and autofilling based on all authors would be very helpful.

  4. Better contextual guesses for issue autofilling (after typing #). For instance, if an issue has already been referenced or cross-referenced, it should appear near the top of the list. We have almost 3000 open issues in SymPy, and the current issue numbers are 5-digits long, so referencing an issue purely by number is very error prone.

  5. Auto-update edited comments. Context: SymPy uses a bot that comments on every pull request, requiring a release notes entry to be added to the description. This works quite well, but to prevent the bot from spamming, we have it comment only once, and edit its comment on any further checks. However, these edits do not automatically update, so people have to manually reload the page to see them.

  6. Don't hide full commit messages by default in the commits view. It would be better to encourage people to write good commit messages.

  7. Make issue cross-references in pull request titles work. I'd rather people didn't put issue numbers in pull request titles but they do it anyway, so it would be nice if they actually worked as links.

  8. Allow me to comment on lines that aren't visible by default. That is, lines that you have to click the "expand" icon above or below the line numbers to access. As an example, this can be useful to point out a line that should have been changed but wasn't.

  9. Copying code from a diff that includes lines that aren't visible by default includes an extra space to the left for those lines. This is a straight up bug. Probably fixing the previous point would also fix this :)

  10. Make searches include text from the pull request diff.

  11. When a diff indents a line, color the whitespace to the left of the line. (see this)

  12. Pull requests can show commits that are already in master. For example, if someone makes pull request B based off of pull request A and then A gets merged, B will still show the commits from A. This has been a bug forever.

  13. Make the "jump to file or symbol" popdown collapsible. Specifically what I mean is I want to be able to show just the files, without any symbols. For large pull requests, it is very difficult to use this popdown if there are hundreds of symbols. I typically want to just jump to a specific file.

  14. The status check on the favicon goes away when you switch to the diff tab. Kudos to Marius Gedminas for pointing this out.

  15. Apparently status checks that use the GitHub Apps API are forced to link into the checks tab. The checks tab is useless if no information is actually published to it. It would be better if it could link straight to the external site, as is done with OAuth integrations.

  16. Make it easier to copy someone's username from the pull request page. I generally do this to git remote add them (using hub). If I try to select their username from a comment, it's a link, which makes it hard to select. I generally copy it from the blue text at the top "user wants to merge n commits from sympy:master from user:branch". If it were easier to select "user" or "branch" from that box (say, by double clicking), that would be helpful.

  17. Change the "resolve conversation" UI. I keep pressing it by accident because it's where I expect the "new comment" button to be.

Reviews

I wrote a whole post about the reviews feature when it came out. Not much has changed since then (actually, it has gotten worse). In short, the feature doesn't work like I would like it to, and I find the default behavior of deferred comments to be extremely detrimental. If there were a way to completely disable reviews (for myself, I don't care about other people using the feature), I would.

See my blog post for full details on why I think the reviews feature is broken and actually makes things worse, not better than before. I've summarized a few things that could change below.

  1. Make reviews non-deferred by default. This is the biggest thing. If I had to pick only a single item on this page to be changed, it would be this. The issue is if I start a review and walk away from it, but forget to "finalize" it with a review status, the review is never actually seen by anyone. The simplest way to fix this would be to simply make partial reviews public.

  2. Make Cmd-Enter default to immediate comment. Barring the above change, Cmd-Enter on a pull request line comment should default to immediate comment, not deferred (review) comment. The problem with the Cmd-Shift-Enter shortcut is that it is inconsistent: on a normal comment, it closes the pull request, and on a reply comment, it does nothing. I shouldn't have to check what "comment context" I am in to figure out what keyboard shortcut to use. The worst part is if you accidentally start a review, it's a pain in the ass to undo that and just post a normal comment. The simplest way to fix this would be to swap the current meaning of Cmd-Enter and Cmd-Shift-Enter for line comments (and no, this wouldn't be a backwards incompatible change, it would be a regression fix; Cmd-Enter used to do the right thing).

  3. Allow reviewing your own pull request. There's no reason to disallow this, and it would often be quite useful to, for instance, mark a work in progress PR as such with a "request changes" review. Obviously self-reviews would be excluded from any required reviews.

  4. Unhide the reviews box. It should just be the same box as the comment box, instead of buried on the diff tab (see my blog post).

  5. Show review status in the pull request list as a red X or green check. This would make it easier to see which pull requests have reviews.

  6. Allow new commits to invalidate reviews. That way they work the same way as any other status check. (I see that this is now an option for required reviews, which is new since my original blog post, but it still doesn't affect the status as reported on the pull requests list).

  7. Allow requiring zero negative reviews to merge (but not necessarily any positive reviews). Requiring a positive review is pointless. The person merging can just add one real quick before they merge, but it is unnecessary extra work. On the other hand allowing people with push access to block a merge with a negative review would be very useful.

Web editor

  1. The web editor seems to have a search function, but I can't get it to actually work. Half the time Cmd-F pops open the browser search, which doesn't find text that isn't on screen. And when I press Cmd-G to actually do the search, it doesn't work (and there are no buttons to perform the search either).

  2. Add basic syntax testing in the web editor for common languages to catch basic mistakes.

Mobile site

  1. Please make the mobile site work with iOS 10. I don't see any reason why simple things like buttons (like the merge button or the comment button) shouldn't work on a slightly older browser. No, I am not a web developer, but I do use my phone a lot and I've noticed that literally every other website works just fine on it.

  2. Add a way to disable the mobile site permanently. For the most part, the mobile site is useless (see below). If you aren't going to put full development effort into it, allow me to disable it permanently so that every time I visit github.com on my phone it goes to the desktop site.

Seeing as how the site (mobile or not) is almost completely unusable on every mobile device I own, it's hard to list other things here, but based on back when it actually worked, these are some of the things that annoyed me the most. Basically, I have found that virtually every time I go to GitHub to do anything on mobile, I have to switch to desktop mode to actually do what I want.

My apologies if any of these actually work now: as I said, github.com doesn't actually work at all on my phone.

  1. Cannot search issues on mobile.

  2. Cannot make a line comment that isn't a review on mobile.

  3. Cannot view lines beyond the default diff in pull requests on mobile.

  4. Show more than 2 lines of the README and 0 lines of the code by default on project pages. Yes, mobile screens are small, but it's also not hard to scroll on them.

  5. Support Jupyter notebook rendering on mobile.

Files view

  1. GitHub needs a better default color theme for syntax highlighting. Most of the colors are very similar to one another and hard to differentiate. Also things like strings are black, even though one of the most useful aspects of syntax highlighting generally is to indicate whether something is in a string or not.

  2. Add MathJax support to markdown files. This would be amazingly useful for SymPy, as well as many scientific software projects. Right now if you want this you have to use a Jupyter notebook. MathJax support in issue/pull request comments would be awesome as well, though I'm not holding out for that.

  3. Add "display source" button for markdown, ReST, etc. I mean the button that is already there for Jupyter notebooks. Right now you have to view markdown and ReST files "raw" or edit the file to see their source.

  4. Add a link to the pull request in the blame view. Usually I want to find the pull request that produced a change, not just the commit.

Wiki

  1. The wikis used to support LaTeX math with MathJax. It would be great if this were re-added.

  2. The ability to set push permissions for the wiki separately from the repo it is attached to, or otherwise create an oauth token that can only push to the wiki would be useful. Context: for SymPy, we use a bot that automatically updates our release notes on our wiki. It works quite well, but the only way it can push to the wiki is if we give it push access to the full repo.

Notification emails

  1. Don't clobber special emails/email headers. GitHub adds special emails like author@noreply.github.com and mention@noreply.github.com to email notifications based on how the notification was triggered. This is useful, as I can create an email filter for author@noreply.github.com for notifications on issues and pull requests created by me. The bad news is, mention@noreply.github.com, which is added when I am @mentioned, clobbers author@noreply.github.com, so that it doesn't appear anymore. In other words, as soon as someone @mentions me in one of my issues, I become less likely to see it, because it no longer gets my label (I get @mentioned on a lot of issues and don't have the ability to read all of my notification emails). Ditto for the X-GitHub-Reason email headers.

  2. Re-add the "view issue" links in Gmail. (I forgot what these are called.) GitHub notification emails used to have these useful "view issue" buttons that showed up on the right in the email list in Gmail, but they were removed for some reason.
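To make the first point above concrete, here is a minimal sketch of what an email filter gets to work with. The sample message is made up, the `github_reasons` helper is hypothetical, and the assumption that the special addresses arrive in the Cc field is mine. The point is only that a filter can match on whatever reasons are present in the message, so once the author address is replaced by the mention address, there is nothing left for an "author" filter to match.

```python
from email import message_from_string
from email.utils import getaddresses


def github_reasons(raw_message):
    """Collect the notification reasons visible in a raw email message.

    Hypothetical helper: it looks at Cc addresses of the form
    <reason>@noreply.github.com (assumed location) and at the
    X-GitHub-Reason header.
    """
    msg = message_from_string(raw_message)
    reasons = set()
    # Reasons encoded as special Cc addresses.
    for _name, addr in getaddresses(msg.get_all("Cc", [])):
        local, _, domain = addr.partition("@")
        if domain.lower() == "noreply.github.com":
            reasons.add(local.lower())
    # Reason encoded in the dedicated header.
    header = msg.get("X-GitHub-Reason")
    if header:
        reasons.add(header.strip().lower())
    return reasons


# Made-up notification for an issue I created and was later @mentioned on.
sample = (
    "From: Someone <notifications@github.com>\n"
    "To: sympy/sympy <sympy@noreply.github.com>\n"
    "Cc: Me <user@example.com>, Mention <mention@noreply.github.com>\n"
    "X-GitHub-Reason: mention\n"
    "Subject: Re: [sympy/sympy] Example issue\n"
    "\n"
    "Body text.\n"
)

reasons = github_reasons(sample)
print(sorted(reasons))
print("author" in reasons)
```

If both reasons were kept on the message, the same function would return both, and a filter on either one would keep working.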

API

  1. Make the requests in the API docs actually return what they show in the docs. This means the example repo should have actual example issues corresponding to what is shown in the docs.

  2. Allow giving deploy key access to just one branch. That way I can have a deploy key for gh-pages and minimize the attack surface that the existence of that key produces. I think everyone would agree that more fine-grained permissions throughout the API would be nice, but this is one that would benefit me personally, specifically for my project doctr.

GitHub Pages

GitHub Pages is one of the best features of GitHub, and in fact, this very blog is hosted on it. Very few complaints here, because for the most part, it "just works".

  1. Moar themes. Also it's awesome that you can use any GitHub repo as a theme now, but it turns out most random themes you find around GitHub don't actually work very well.

  2. The steps to add HTTPS to an existing GitHub Pages custom domain are a bit confusing. This took us a while to figure out for sympy.org. To get things to work, you have to trigger GitHub to issue a cert for the domain. But the UI to issue the cert is to paste the domain into the box. So if the domain is already there but it doesn't work, you have to re-enter it. Also, if you want both www and the apex domain to be HTTPS, you have to enter them both in the box to trigger GitHub to issue a cert. This is primarily a UX issue. See https://github.com/sympy/sympy.github.com/issues/105#issuecomment-415899934.

Settings

  1. Automatically protected branches make the branch difficult to delete when you are done with it. My use-case is to create a branch for a release, which I want to protect, but I also want to delete the branch once it is merged. I can protect the branch automatically pretty easily, but once it's merged I have to go and delete the protection rule before I can delete the branch. There are several ways this could be fixed. For instance, you could add a rule to allow protected branches to be deleted if they are up-to-date with the default branch.

  2. Add a way to disable the ability for non-admins to create new branches on a repo. We want all of our pull requests to come from forks. Branches in the repo just create confusion, for instance, they appear whenever someone clones the repository.

  3. Related to the previous point, make pull request reverts come from forks. Right now when someone uses the revert pull request button, it creates a new branch in the same repo, but it would be better if the branch were made in the person's fork.

  4. Allow me to enable branch protection by default for new repos.

  5. Allow me to enable branch protection by default on new branches. This is more important than the previous one because of the feature that lets people push to your branch on a pull request (which is a great feature by the way).

  6. Clicking a team name in the settings should default to the "members" tab. I don't understand why GitHub has a non-open "discussions" feature, but I find it to be completely useless, and generally see such things as harmful for open source.

  7. Suggest people to add push access to. I don't necessarily mean passively (though that could be interesting too), but I mean in the page to add someone, it would be nice if the popup suggested or indicated which people had contributed to the project before, since just searching for a name searches all of GitHub, and I don't want to accidentally give access to the wrong person.

Profiles

  1. Stop trying to make profile pages look "cute" with randomly highlighted pull requests. GitHub should have learned by now that profile pages matter a lot (whether people want them to or not), and there can be unintended consequences to the things that are put on them.

  2. Explain what the axes actually mean in the new "activity overview". I'm referring to this (it's still in beta and you have to manually enable it on your profile page). Personally I'm leaving the feature off because I don't like being metricized/gamified, but if you're going to have it, at least include some transparency.

Releases

  1. Allow hiding the "source code (zip)" and "source code (tar.gz)" files in a release. We upload our actual release files (generated by setup.py) to the GitHub release. We want people to download those, not snapshots of the repo.

Miscellaneous

  1. The repository search function doesn't support partial matches. This is annoying for conda-forge. For instance, if I search for "png" it doesn't show the libpng-feedstock repo.

  2. Show commit history as a graph. Like git log --graph. This would go a long way to helping new users understand git. When I first started with git, understanding the history as a graph was a major part of me finally grokking how it worked.

  3. Bring back the old "fork" UI. The one that just had icons for all the repos, and the icons didn't go away or become harder to find if you already had a fork. Some of us use the "fork" button to go to our pre-existing forks, not just to perform a fork action. This was recently changed and now it's better than it was, but I still don't see why existing forks need to be harder to find, visually, than nonexisting ones.

  4. Provide a more official way to request fixes to these cuts. I often ask on Twitter, but get no response. Preferably something public so that others could vote on them (but I understand if you don't want too much bikeshedding).