It's been over two years since the last major release of ekg. Ever since the first release I've known there were a number of features I wanted in ekg but didn't implement back then. This release adds most of them.
Integration with other monitoring systems
When I first wrote ekg I knew it only solved half of the program monitoring problem. Good monitoring requires two things:
1. a way to track what your program is doing, and
2. a way to gather and persist that data in a central location.
The latter is necessary because
- you don't want to lose your data if your program crashes (ekg only stores metrics in memory),
- you want an aggregate picture of your whole system over time, and
- you want to define alarms that go off if some metric passes some threshold.
Ekg has always done (1), as it provides a way to define metrics and inspect their values e.g. using your web browser or curl.
Ekg could help you do (2), as you could use the JSON API to sample metrics and then push them to an existing monitoring solution, such as Graphite or Ganglia. However, it was never really convenient.
As of this release, (2) gets much easier.
Statsd integration
Statsd is "a network daemon that ... listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services (e.g., Graphite)."
Statsd is quite popular, with both client and server implementations in multiple languages. It supports quite a few backends, such as Graphite, Ganglia, and a number of hosted monitoring services. It's also quite easy to install and configure (although many of the backends it supports are not).
Ekg can now be integrated with statsd using the new ekg-statsd package. With a few lines of code you can have your metrics sent to statsd:
import System.Metrics
import System.Remote.Monitoring.Statsd

main :: IO ()
main = do
    store <- newStore
    -- Register some metrics with the metric store:
    registerGcMetrics store
    -- Periodically flush metrics to statsd:
    _ <- forkStatsd defaultStatsdOptions store
    return ()  -- ... the rest of your application ...
ekg-statsd can be used either together with ekg, if you also want the web interface, or standalone, if the dependencies pulled in by ekg are too heavyweight for your application or you don't care about the web interface. ekg has been extended so that it can share its Server's metric store with other parts of the application:
import System.Remote.Monitoring
import System.Remote.Monitoring.Statsd

main = do
    handle <- forkServer "localhost" 8000
    forkStatsd defaultStatsdOptions (serverMetricStore handle)
Once you've set up statsd and e.g. Graphite, these few lines are enough to make your metrics show up in Graphite.
Integration with your monitoring systems
The ekg APIs have been reorganized and the package split up so that it's much easier to write your own package to integrate with the monitoring system of your choice. The core API for tracking metrics has been moved from the ekg package into a new ekg-core package. Using this package, ekg-statsd could be implemented in a mere 121 lines.
While integrating with other systems was technically possible in the past using the ekg JSON API, it was both inconvenient and wasted CPU cycles generating and parsing JSON. Now you can get an in-memory representation of all metrics at a given point in time using the System.Metrics.sampleAll function:
-- | Sample all metrics. Sampling is /not/ atomic in the sense that
-- some metrics might have been mutated before they're sampled but
-- after some other metrics have already been sampled.
sampleAll :: Store -> IO Sample

-- | A sample of some metrics.
type Sample = HashMap Text Value

-- | The value of a sampled metric.
data Value = Counter !Int64
           | Gauge !Int64
           | Label !Text
           | Distribution !Stats
All that ekg-statsd does is call sampleAll periodically and convert the returned Values to UDP packets that it sends to statsd.
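To give a feel for what that involves, here's a minimal sketch of such a conversion, covering only counters and gauges; toStatsdLine and flushOnce are made-up names for illustration, not part of the ekg-statsd API:

{-# LANGUAGE OverloadedStrings #-}
import qualified Data.HashMap.Strict as M
import Data.Maybe (mapMaybe)
import Data.Monoid ((<>))
import qualified Data.Text as T
import System.Metrics

-- Render one sampled metric in the statsd wire format,
-- e.g. "rts.gc.bytes_allocated:42|c".
toStatsdLine :: T.Text -> Value -> Maybe T.Text
toStatsdLine name (Counter n) = Just (name <> ":" <> T.pack (show n) <> "|c")
toStatsdLine name (Gauge n)   = Just (name <> ":" <> T.pack (show n) <> "|g")
toStatsdLine _ _              = Nothing  -- labels/distributions omitted

-- Take one sample and render every metric we know how to encode.
-- The real library would then write these lines to a UDP socket.
flushOnce :: Store -> IO [T.Text]
flushOnce store = do
    sample <- sampleAll store
    return (mapMaybe (uncurry toStatsdLine) (M.toList sample))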
Namespaced metrics
In a large system each component may want to contribute their own metrics to the set of metrics exposed by the program. For example, the Snap web server might want to track the number of requests served, the latency for each request, the number of requests that caused an internal server error, etc. To allow several components to register their own metrics without name clashes, ekg now supports namespaces.
Namespaces also make it easier to navigate metrics in UIs. For example, Graphite gives you tree-like navigation of metrics based on their namespaces.
In ekg, dots in metric names are now interpreted as namespace separators. For example, the default GC metric names all start with "rts.gc.", and Snap could prefix all its metric names with "snap.". While this doesn't make collisions impossible, it should make them much less likely.
If your library wants to provide a set of metrics for the application, it should expose a function that looks like this:
registerFooMetrics :: Store -> IO ()
The function should call the various register functions in System.Metrics. It should also document which metrics it registers. See System.Metrics.registerGcMetrics for an example.
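Here's a minimal sketch of what that might look like, for a made-up "foo" component with a single request counter:

{-# LANGUAGE OverloadedStrings #-}
import Data.IORef (newIORef, readIORef)
import System.Metrics

-- | Register all metrics of the (hypothetical) foo component under
-- the \"foo.\" namespace:
--
-- [@foo.requests_served@] Total number of requests served.
registerFooMetrics :: Store -> IO ()
registerFooMetrics store = do
    -- In a real library this IORef would live in the component's own
    -- state and be incremented as requests are served:
    requestsServed <- newIORef 0
    registerCounter "foo.requests_served" (readIORef requestsServed) store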
A new metric type for tracking distributions
It's often desirable to track the distribution of some event. For example, you might want to track the distribution of response times for your webapp, so you get notified if things suddenly get slow and have the data you need to optimize the latency.
The new Distribution metric lets you do just that. Every time an event occurs, simply call the add function:
add :: Distribution -> Double -> IO ()
The add function takes a value which could represent e.g. the number of milliseconds it took to serve a request.
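By way of illustration, here's a small self-contained sketch that times a stand-in request handler and records the elapsed milliseconds; the metric name myapp.response_time_ms and serveRequest are made up:

{-# LANGUAGE OverloadedStrings #-}
import Data.Time.Clock (diffUTCTime, getCurrentTime)
import qualified System.Metrics as Metrics
import qualified System.Metrics.Distribution as Distribution

main :: IO ()
main = do
    store   <- Metrics.newStore
    latency <- Metrics.createDistribution "myapp.response_time_ms" store
    start <- getCurrentTime
    serveRequest
    end <- getCurrentTime
    -- Record the elapsed time for this event, in milliseconds:
    Distribution.add latency (realToFrac (diffUTCTime end start) * 1000)
  where
    serveRequest = return ()  -- stand-in for real work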
When the distribution metric is later sampled, you're given a value that summarizes the distribution by providing the mean, variance, min/max, and so on.
The implementation uses an online algorithm to track these statistics so it uses O(1) memory. The algorithm is also numerically stable so the statistics should be accurate even for long-running programs.
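For the curious, the flavor of constant-memory update involved looks roughly like Welford's classic online algorithm, sketched below; this illustrates the technique, not ekg's actual implementation:

import Data.Int (Int64)

-- Running statistics that can be updated in O(1) time and memory.
data OnlineStats = OnlineStats
    { osCount :: !Int64
    , osMean  :: !Double
    , osM2    :: !Double  -- sum of squared deviations from the mean
    }

-- Welford's update: numerically stable, one pass, constant memory.
addSample :: OnlineStats -> Double -> OnlineStats
addSample (OnlineStats n mean m2) x = OnlineStats n' mean' m2'
  where
    n'    = n + 1
    delta = x - mean
    mean' = mean + delta / fromIntegral n'
    m2'   = m2 + delta * (x - mean')

variance :: OnlineStats -> Double
variance (OnlineStats n _ m2)
    | n < 2     = 0
    | otherwise = m2 / fromIntegral (n - 1)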
While it didn't make this release, in the future you can look forward to tracking quantiles and keeping histograms of events. This will let you track e.g. the 95th-percentile response time of your webapp.
Counters and gauges are always 64-bits
Previously, counters and gauges were stored as Int values to keep ekg efficient even on 32-bit platforms. However, a counter that's increased 10,000 times per second, which isn't unusual for a busy server, would wrap around in less than 2.5 days (a signed 32-bit counter overflows after 2^31, roughly 2.1 billion, increments; at 10,000 increments per second that's about 60 hours). Therefore all counters and gauges are now stored as 64-bit values. While this is technically a breaking change, it shouldn't affect the majority of users.
I received a report of contention in ekg when multiple cores were used, which prompted me to improve the scaling of all metric types. The difference is quite dramatic in my heavy-contention benchmark:
           +RTS -N1    +RTS -N6
Before       1.998s     82.565s
After        0.117s      0.247s
The benchmark updates a single counter concurrently from 100 threads, performing 100,000 increments per thread. It was run on a 6-core machine. The cause of the contention was atomicModifyIORef, which has been replaced by an atomic-increment instruction. There are some details on the GHC Trac.
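Here's a sketch of what such a benchmark can look like (my rough reconstruction, not the exact benchmark code):

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import qualified System.Metrics.Counter as Counter

main :: IO ()
main = do
    counter <- Counter.new
    dones <- mapM (const newEmptyMVar) [1 .. 100 :: Int]
    -- 100 threads hammer a single shared counter...
    forM_ dones $ \done -> forkIO $ do
        replicateM_ 100000 (Counter.inc counter)
        putMVar done ()
    -- ...and we wait for all of them to finish.
    mapM_ takeMVar dones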
In short, you shouldn't see contention issues anymore. If you do, I still have some optimizations I held back because the implementation should already be fast enough.