[Haskell-cafe] Project idea, seeking feedback
Markus Läll
markus.l2ll at gmail.com
Wed Nov 8 21:10:53 UTC 2017
Hi Alex,
this is obviously highly ambitious if you want to get it right. If you
actually plan to start on it, I would begin with the distributed part,
since there are already plenty of DSLs that end up running on a single
machine. I.e., make something that passes around an Int, and have it
deploy to any number of machines. Then gradually add complexity:
distributed queues and workers, and ways to enforce ordering on when
the results of work are submitted/accepted. Implementing precedence
graphs would be interesting. Then there is limiting congestion, and
probably many more kinds of limits you will want to add at different
points in the graph.
Just some random ideas. But start and deploy something very simple at
first.
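To make the precedence-graph step a bit more concrete, here is a
minimal single-machine sketch using only base. It assumes each task
simply lists the tasks whose results it has to wait for, and it groups
tasks into stages that could run in parallel. The names Task,
Precedence, schedule and pipeline are made up for illustration, and a
real distributed version would need much more (failures, retries,
partial results).

```haskell
import Data.List (partition)

-- Hypothetical types: a task name, and for each task the tasks
-- whose results it must wait for.
type Task = String
type Precedence = [(Task, [Task])]

-- Kahn-style layering: repeatedly release every task whose
-- prerequisites are all done. Each inner list is one stage whose
-- tasks are independent of each other. Returns Nothing when no
-- progress is possible (a cycle, or a prerequisite that is never
-- defined).
schedule :: Precedence -> Maybe [[Task]]
schedule = go []
  where
    go _ [] = Just []
    go done pending =
      case partition (all (`elem` done) . snd) pending of
        ([], _) -> Nothing
        (ready, rest) ->
          let names = map fst ready
          in (names :) <$> go (done ++ names) rest

-- Illustrative input only.
pipeline :: Precedence
pipeline =
  [ ("deploy", ["test", "lint"])
  , ("test",   ["build"])
  , ("lint",   ["build"])
  , ("build",  [])
  , ("notify", ["deploy"])
  ]
```

Here schedule pipeline evaluates to
Just [["build"],["test","lint"],["deploy"],["notify"]], and a cyclic
input such as [("a",["b"]),("b",["a"])] gives Nothing.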
On Wed, Nov 8, 2017 at 2:21 PM, Alex <alex at centromere.net> wrote:
> Hi all,
>
> I'm seeking feedback for a project I'd like to start. I have a bit of
> experience developing large-scale systems containing many
> microservices, databases, message queues, and caches over many VMs. Time
> and time again I find myself confronted with the same problems:
>
> 1. It is difficult to trace events through the system: Consider an HTTP
> request made by a customer to a public API. Which microservices were
> impacted by that request? What SQL queries were run as a result of that
> request? What 3rd party APIs were consulted during the request's
> fulfillment? Answers to these questions are essential to fixing bugs
> quickly, and yet they are so difficult to answer (at least in my
> experience).
>
> 2. Problems are difficult to reproduce: When Customer Success walks in
> and says, "I have an angry customer on the phone. They want to know why
> [FOO] wasn't properly [BAR]" it is often impossible to give an answer
> without interactive troubleshooting and hours of grepping through
> unstructured log files. Troubleshooting may incur additional expenses
> too, since (for instance) you may hit your API request limit for a 3rd
> party service.
>
> 3. Business and non-business logic are not well encapsulated: Often I
> see code related to (for example) RabbitMQ interwoven with core business
> logic when calls need to be made to other microservices. The fact that
> RabbitMQ facilitates communication between microservices is an
> implementation detail that I shouldn't have to think about.
>
> 4. Resource consumption is non-uniform: Some microservices are more
> demanding than others in terms of CPU, memory, and disk usage.
> Achieving optimal "packing" is difficult. In other words, some VMs
> will have a high load and others will remain idle. Auto scaling groups
> can help with this in theory, but I don't think they can achieve the
> kind of density I would like to see.
> Moreover, what constitutes a "resource"? If a 3rd party service
> rate limits requests by IP address, couldn't each request be considered
> a resource unit which needs to be properly load balanced, just as you
> would with CPU?
>
> Given these motivations, I would like to flesh out some ideas for a
> framework/platform which addresses these issues. These ideas are
> half-baked and may not tie in well with one another.
>
> I envision a distributed system as follows:
>
> 1. One kind of VM:
> DevOps people have a saying: "Treat your VMs like cattle, not
> pets". In practice, "cattle" becomes "cows, chickens, pigs, and
> lobster". VMs typically have an assigned role, and they become part of
> a group which may or may not be auto-scaling. For a given instantiation
> of this hypothetical platform, I would like to see a single kind of VM.
> That is, every VM is identical to every other VM, and they all run the
> same Haskell application.
>
> 2. Strict separation of business and non-business logic: The framework
> should handle all aspects of communication between nodes (like Cloud
> Haskell does) in a pluggable and transparent way, but that's not all.
> The framework should have first class support for other integrations
> (such as PagerDuty alerting, performance monitoring, etc) which are
> described below.
>
> 3. Pool coordination via DSL: The entire pool of VMs is
> orchestrated/coordinated by one or more "scripts" written in a DSL,
> which is implemented as a Free Monad. Every single "operation" or
> "primitive" in your AST data type is Serializable, and when the
> framework interprets the DSL, it serializes the instruction and sends
> it over the network to a node for execution. The particular node on
> which the instruction gets executed is chosen by the platform, not the
> developer.
>
> 4. Smart resource consumption: Each node brings with it a set of
> resources. It is *not* my intention to create a system which views CPU,
> memory, etc as a contiguous unit. Rather, each primitive instruction in
> the AST is viewed as a "black box" which can only consume as much CPU
> and memory as the node has available to it. The framework is
> responsible for profiling each instruction and scheduling future
> instructions to a node for which resources are predicted to be
> available.
> The developer should be able to define new resources such as 3rd
> party API calls, bandwidth, database connections, etc, all of which are
> profiled just as CPU and memory would be.
>
> 5. Browser based control panel: Engineers should have a GUI at their
> disposal which allows them to watch -- in real time -- the execution
> flow of the DSL script.
>
> 6. Structured logs with advanced filtering: All log output should be
> structured with first class support for shipping the data to
> Logstash/ElasticSearch. The aforementioned GUI should be able to
> selectively filter output based on certain pre-defined predicates and
> display them to the developer. For example, if you're building an email
> virus scanning system (which may see millions of emails per day), you
> may want to limit the real-time debugging output to only a specific
> customer.
>
> 7. First class integration with modern tools and services: The system
> should integrate with Consul, PagerDuty, statsd, RabbitMQ, memcache,
> DataDog, Logstash, and Slack, with new integrations being easy to add.
> This is vital for clean separation of business and non-business logic.
> For example, the developer should be able to cache certain bits of data
> at will, without having to worry about opening and managing a TCP
> connection to memcache.
>
> This is my vision, and I want to build it completely in Haskell. What
> do you all think?
>
> --
> Alex
> _______________________________________________
> Haskell-Cafe mailing list
> To (un)subscribe, modify options or view archives go to:
> http://mail.haskell.org/cgi-bin/mailman/listinfo/haskell-cafe
> Only members subscribed via the mailman list are allowed to post.
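As a very small starting point for point 3 above (a free-monad DSL
whose primitives are serializable and get placed on nodes by the
platform), something along these lines might do. It is only a sketch
under several assumptions: the free monad is hand-rolled so that
nothing beyond base is needed, Show/Read stands in for a real wire
format, a round-robin pick stands in for a real scheduler, and
runOnNode runs locally where a real system would do a network round
trip. All names (Op, Step, Script, runOnNode, interpret) are
hypothetical. The only value passed around is an Int.

```haskell
-- A hand-rolled free monad, so the sketch needs only base.
data Free f a = Pure a | Free (f (Free f a))

instance Functor f => Functor (Free f) where
  fmap g (Pure a)  = Pure (g a)
  fmap g (Free fa) = Free (fmap (fmap g) fa)

instance Functor f => Applicative (Free f) where
  pure = Pure
  Pure g  <*> x = fmap g x
  Free fg <*> x = Free (fmap (<*> x) fg)

instance Functor f => Monad (Free f) where
  Pure a  >>= k = k a
  Free fa >>= k = Free (fmap (>>= k) fa)

-- The serializable part of a primitive: what would go over the wire.
data Op = Add Int Int | Twice Int
  deriving (Show, Read, Eq)

-- One DSL step: an operation plus a continuation awaiting its result.
data Step next = Step Op (Int -> next)

instance Functor Step where
  fmap g (Step op k) = Step op (g . k)

type Script = Free Step

add :: Int -> Int -> Script Int
add x y = Free (Step (Add x y) Pure)

twice :: Int -> Script Int
twice x = Free (Step (Twice x) Pure)

-- Stand-in for a remote node: decode the instruction, run it, reply.
runOnNode :: String -> Int
runOnNode wire = case read wire of
  Add x y -> x + y
  Twice x -> 2 * x

type Node = String

-- Interpret a script. The node for each instruction is chosen here
-- (round-robin), not by the script author. Returns the result and a
-- trace of which serialized instruction went to which node.
interpret :: [Node] -> Script a -> (a, [(Node, String)])
interpret nodes = go (cycle nodes)
  where
    go _ (Pure a) = (a, [])
    go (n : ns) (Free (Step op k)) =
      let wire       = show op
          result     = runOnNode wire
          (a, trace) = go ns (k result)
      in (a, (n, wire) : trace)
    go [] _ = error "interpret: no nodes available"

-- Illustrative script: ordinary do-notation, no mention of nodes.
example :: Script Int
example = do
  s <- add 1 2
  d <- twice s
  add d 10
```

With two nodes, interpret ["vm-a", "vm-b"] example gives
(16, [("vm-a","Add 1 2"),("vm-b","Twice 3"),("vm-a","Add 6 10")]).
The trace is also the natural place to hang the event tracing and
structured logging from the proposal, since every instruction passes
through the interpreter.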
--
Markus Läll