Every year I put together a list of Google Summer of Code projects I'd like to see students work on. Here's my list for this year.
As usual, the focus is on existing infrastructure. I believe, and I think our past experience bears this out, that such projects are more successful.
Improved Hackage login
The Hackage login/user system could use several improvements.
From a security perspective, we need to switch to a current best practice implementation of password storage, such as bcrypt, scrypt, or PBKDF2. MD5, which is what HTTP Digest Auth uses, has known attacks.
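As a rough illustration of what salted, deliberately slow password hashing looks like, here is a minimal sketch using PBKDF2 from Python's standard library. The function names, the iteration count, and the (salt, key) storage format are assumptions made for the example, not anything Hackage does or has decided on.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative work factor only; a real deployment would tune this.
ITERATIONS = 100_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage; a fresh random salt is used per password."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key from the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

The per-password salt means two users with the same password get different stored keys, and the iteration count is what makes brute-forcing a leaked database expensive.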
From a usability perspective, we need to move to a cookie-based login system. While HTTP auth is convenient to implement, it doesn't work well for users, which is why sites that otherwise try to follow the REST approach don't use it. A cookie-based approach allows us to, among other things,
- display the current login status of the user,
- allow users to conveniently access a user preference page,
- allow users to log out, and
- adapt the UI to the current user.
An example of the last item would be to show a link to the maintainer section only for packages you maintain, or to show additional actions to site admins. HTTP auth also introduces an extra page transition when you want to go from a list of items to editing that list (e.g. you can't edit uploaders on /packages/uploaders/ directly; you need to click the link that takes you to /packages/uploaders/edit). This is because HTTP auth does authentication on a per-HTTP-request basis.
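To make the difference concrete, here is a minimal sketch of the server-side bookkeeping behind a cookie-based login. Everything in it (the `SessionStore` class, the in-memory dictionary, the `can_edit_uploaders` helper) is hypothetical and only meant to show why a session token makes login status, logout, and a per-user UI straightforward.

```python
import secrets
from typing import Dict, Optional, Set


class SessionStore:
    """Maps opaque session tokens (the cookie value) to user names."""

    def __init__(self) -> None:
        self._sessions: Dict[str, str] = {}

    def log_in(self, user: str) -> str:
        # The random token is what would be sent back to the browser as a cookie.
        token = secrets.token_urlsafe(32)
        self._sessions[token] = user
        return token

    def current_user(self, token: Optional[str]) -> Optional[str]:
        # Returns None for a missing or unknown cookie, i.e. "not logged in".
        if token is None:
            return None
        return self._sessions.get(token)

    def log_out(self, token: str) -> None:
        # Forgetting the token invalidates the cookie on the server side.
        self._sessions.pop(token, None)


def can_edit_uploaders(user: Optional[str], maintainers: Set[str]) -> bool:
    """Decide whether to show the edit controls to this user."""
    return user is not None and user in maintainers
```

Because the user is known on every page view, a page listing uploaders could render its edit controls inline for maintainers instead of linking to a separate edit page.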
Other Hackage usability improvements
There are several other Hackage usability improvements I'd like to see.
The homepage is currently a write-up about the new Hackage server. While that made sense when the new Hackage server was brand new, a more useful homepage would include a list of recently updated packages, most popular packages, packages you maintain, and a link to "getting started" material and other documentation. Looking at other languages' package repo homepages for inspiration wouldn't be a bad start.
The search result page should include download counts and a more easily scannable result list. The current list is hard to read because the package descriptions don't line up. For example, compare the search result page for "xml" for Hackage and Ruby Gems.
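One way to read "line up" is fixed-width columns. The sketch below is a hypothetical helper, and the package names and download counts used with it are made up purely for illustration.

```python
from typing import List, Tuple


def format_results(results: List[Tuple[str, int, str]]) -> List[str]:
    """Render (name, downloads, synopsis) rows as fixed-width columns."""
    if not results:
        return []
    name_width = max(len(name) for name, _, _ in results)
    count_width = max(len(str(count)) for _, count, _ in results)
    lines = []
    for name, count, synopsis in results:
        # Left-align names and right-align counts so every synopsis
        # starts in the same column.
        lines.append(f"{name.ljust(name_width)}  {str(count).rjust(count_width)}  {synopsis}")
    return lines


# Made-up rows, for illustration only.
rows = [("pkg-a", 1200, "First synopsis"), ("pkg-longer-name", 34, "Second synopsis")]
for line in format_results(rows):
    print(line)
```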
Faster Cabal/GHC parallel builds
Mikhail Glushenkov and others have done a great job making our compiles faster. Cabal already builds packages in parallel, and with GHC 7.8 it will build modules in parallel as well.
There are still more opportunities for parallelism. Cabal doesn't build individual components or different versions of the same component (e.g. vanilla and profiling) in parallel.
Building all the test suites in parallel would save time if you have many test suites, and building vanilla and profiling versions at the same time would allow users to turn on profiling by default (in ~/.cabal/config) without paying (much of) a compile-time penalty.
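The general idea of building independent components concurrently can be sketched as dependency-aware scheduling: start every component whose dependencies are already built, and release more work as builds finish. This is not how Cabal schedules anything; the `build_in_parallel` function, the `build` callback, and the component names in the usage note are hypothetical.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Callable, Dict, List, Set


def build_in_parallel(
    deps: Dict[str, Set[str]],
    build: Callable[[str], None],
    jobs: int = 4,
) -> List[str]:
    """Build every component, starting each one as soon as its dependencies are done.

    `deps` maps a component to the set of components it depends on.
    Returns the components in the order they finished building.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if the dependencies are cyclic
    finished: List[str] = []
    running = {}  # future -> component name
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        while sorter.is_active():
            # Everything returned by get_ready() has all of its dependencies built.
            for component in sorter.get_ready():
                running[pool.submit(build, component)] = component
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any build failure
                component = running.pop(future)
                finished.append(component)
                sorter.done(component)
    return finished
```

For example, with `deps = {"lib": set(), "lib-prof": set(), "test-a": {"lib"}, "test-b": {"lib"}}`, the vanilla and profiling libraries can build at the same time, and both test suites start as soon as `lib` is done.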
There's already some work underway here, so there might not be enough Cabal work to last a student through the summer. The remaining time could be spent increasing the amount of parallelism offered by ghc -j.
Today the parallel speed-up offered by ghc -j is quite modest, and I believe we ought to be able to increase it. If you exclude link times, N independent modules of the same size should give close to an N-times parallel speed-up, which I don't think we get today. While real packages don't have this much available parallelism, improvements in the embarrassingly parallel case should help the average case.
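The arithmetic behind that claim can be written down as a small model. It assumes equally sized modules with no dependencies between them and a serial link step; the `ideal_speedup` function and its parameters are illustrative, not a measurement of GHC.

```python
import math


def ideal_speedup(modules: int, workers: int,
                  module_time: float = 1.0, link_time: float = 0.0) -> float:
    """Best-case speed-up for `modules` independent, equally sized modules."""
    sequential = modules * module_time + link_time
    # With no dependencies, the modules compile in ceil(modules / workers) rounds.
    parallel = math.ceil(modules / workers) * module_time + link_time
    return sequential / parallel
```

With one worker per module and link time excluded, the bound is exactly N. Any serial step lowers it, which is why link time is left out of the estimate above.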
Cabal file pretty-printer
If we had a Cabal file pretty-printer, in the spirit of gofmt for Go, we could more easily apply automatic rewrites to Cabal files. Having a formatter that applies a standard (i.e. normalizing) format to all files would make rewrite tools much simpler, as they wouldn't have to worry about preserving user formatting. Some tools that would benefit:
- cabal freeze, which will be included in Cabal-1.20
- cabal init
- A cabal version number bumper/PVP helper
I don't think such a pretty-printer should be terribly clever. Since Cabal files, unlike Haskell, don't support pattern matching, aligning things doesn't really help readability much. Something simple like a 2 (or 4) space indent and starting each list of items on a new line below the item "header" ought to be enough. Here's an example:
name: Cabal
version: 1.19.2
copyright: 2003-2006, Isaac Jones
  2005-2011, Duncan Coutts
license: BSD3
license-file: LICENSE
author:
  Isaac Jones <ijones@syntaxpolice.org>
  Duncan Coutts <duncan@community.haskell.org>
maintainer: cabal-devel@haskell.org
homepage: http://www.haskell.org/cabal/
bug-reports: https://github.com/haskell/cabal/issues
synopsis: A framework for packaging Haskell software
description:
  The Haskell Common Architecture for Building Applications and
  Libraries: a framework defining a common interface for authors to
  more easily build their Haskell applications in a portable way.
  .
  The Haskell Cabal is part of a larger infrastructure for
  distributing, organizing, and cataloging Haskell libraries and
  tools.
category: Distribution
cabal-version: >=1.10
build-type: Custom
extra-source-files:
  README tests/README changelog
source-repository head
  type: git
  location: https://github.com/haskell/cabal/
  subdir: Cabal
library
  build-depends:
    base >= 4 && < 5,
    deepseq >= 1.3 && < 1.4,
    filepath >= 1 && < 1.4,
    directory >= 1 && < 1.3,
    process >= 1.0.1.1 && < 1.3,
    time >= 1.1 && < 1.5,
    containers >= 0.1 && < 0.6,
    array >= 0.1 && < 0.6,
    pretty >= 1 && < 1.2,
    bytestring >= 0.9
  if !os(windows)
    build-depends:
      unix >= 2.0 && < 2.8
  ghc-options: -Wall -fno-ignore-asserts -fwarn-tabs
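The layout rule for list-valued fields is simple enough to sketch. The function below is a hypothetical helper, not part of Cabal: it takes a field name and its already-parsed items and emits the header on its own line, followed by one indented item per line, keeping the items in the order given.

```python
from typing import List


def format_list_field(name: str, items: List[str],
                      indent: int = 2, separator: str = ",") -> List[str]:
    """Lay out a list-valued field as a header line plus one indented item per line."""
    pad = " " * indent
    lines = [f"{name}:"]
    for i, item in enumerate(items):
        # Every item except the last gets the separator; order is left untouched.
        suffix = separator if i < len(items) - 1 else ""
        lines.append(f"{pad}{item.strip()}{suffix}")
    return lines


print("\n".join(format_list_field("build-depends",
                                  ["base >= 4 && < 5", "bytestring >= 0.9"])))
```

Leaving the item order alone is a deliberate choice in this sketch: a normalizing formatter that reorders dependencies would make diffs noisier when a single package is added.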
Don't we already have a pretty-printer under Distribution.PackageDescription.PrettyPrint? Exposing it as a 'cabal format' command should be easy.
I wasn't aware of it. Does it preserve ordering of sections or at least do something reasonable so e.g. all the packages in a deps section don't get randomly reordered if you add another package?
It prints out sections in canonical order (library > executables > tests > benchmarks). Not sure about dependencies.
Re: parallel building of components - I hope to finish this in the near future. It'd be nice to have a faster 'ghc --make -j', but it's unclear to me what exactly needs to be improved and how hard that would be. Would be nice if Patrick (the author of parallel 'ghc --make' patches) could comment.