DJCron – A distributed cron. – Demo time!





djcron-slides

Presentation of djcron

On GitHub: djcron-project / djcron-slides

DJCron

A distributed cron.

About me

What?

  • DJCron is a distributed cron

Why?

  • Cron was developed by Brian Kernighan.
  • ... in 1979.
  • It has been working successfully ever since.

Cron in 1979

# Task to do nothing
* * * * * root do_some_stuff
* */5 * * * root do_more_stuff
# third command
*/3 * * * * root take_on_the_world
          

Cron in 2015

# Task to do nothing
* * * * * root do_some_stuff
* */5 * * * root do_more_stuff
# third command
*/3 * * * * root take_on_the_world
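
As a reading aid for the entries above, here is a minimal Python sketch that expands one crontab field into the values it matches. It handles only `*`, `*/N`, plain numbers, and comma lists. `expand_field` is a hypothetical helper written for this illustration, not part of cron or DJCron.

```python
def expand_field(field, low, high):
    """Return the sorted values in [low, high] matched by one crontab field.

    Supports only '*', '*/N', single numbers and comma-separated lists.
    """
    values = set()
    for part in field.split(','):
        if part == '*':
            values.update(range(low, high + 1))
        elif part.startswith('*/'):
            step = int(part[2:])
            values.update(range(low, high + 1, step))
        else:
            values.add(int(part))
    return sorted(values)


# System crontab layout: minute hour day-of-month month day-of-week user command
line = '* */5 * * * root do_more_stuff'
minute, hour, dom, month, dow, user, command = line.split(None, 6)

print(expand_field(hour, 0, 23))         # hours in which the job may run
print(len(expand_field(minute, 0, 59)))  # matching minutes in each such hour
```

Run as written, this prints `[0, 5, 10, 15, 20]` and `60`: the second entry fires every minute during every fifth hour, not once every five hours.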
          

Current requirements are quite different!

1979                      2015
One machine               Several clusters
Simple scripts            Complex programs
Execution by user         Different execution requirements:
                            • Cluster
                            • Service
                            • Several machines
Logs at /var/log          Distributed logs
Deployment with scripts   Orchestration
SysAdmins                 DevOps/SRE

How

  • Web UI
  • Agents to run tasks
  • Message queues
  • Python
  • Django
  • Celery
    • RabbitMQ
    • Redis
    • MongoDB
  • Djcelery
  • Bootstrap
  • d3
  • metric-graphics

DevOps mind

  • Requirements change!
  • Developers must be able to create their crons.
  • Developers are ultimately responsible.
  • Help developers to help themselves!

Golden rule

Trust the developer... but log everything!

Demo time!


How does it work

  • On edit, it configures Djcelery.
  • The Djcelery scheduler does the main trick.
  • Just one real task, which creates a canvas.
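
One way to picture the "on edit" step: the five time fields of a cron expression line up with the field names used by Celery's crontab schedule (`minute`, `hour`, `day_of_month`, `month_of_year`, `day_of_week`). The sketch below does only that mapping in plain Python. `cron_to_schedule_fields` is a hypothetical helper, and how DJCron actually stores the schedule is not shown in these slides.

```python
# Order follows the crontab line: minute, hour, day of month, month, day of week.
FIELD_NAMES = ('minute', 'hour', 'day_of_month', 'month_of_year', 'day_of_week')


def cron_to_schedule_fields(expression):
    """Split a five-field cron expression into named schedule fields."""
    parts = expression.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError('expected 5 fields, got %d' % len(parts))
    return dict(zip(FIELD_NAMES, parts))


fields = cron_to_schedule_fields('*/3 * * * *')
print(fields['minute'])  # */3
```

Under that assumption, the resulting dict could be passed as keyword arguments to `celery.schedules.crontab(**fields)`.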

The canvas

try:
    task_main()     # Runs on agent
    task_success()  # Runs on server
except:
    task_failure()  # Runs on server
            

Really?

NO!

The real canvas

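import logging
from uuid import uuid4

from celery import shared_task

# djcron_retrieve_script, run_script, djcron_save_results and djcron_error
# are Celery tasks defined elsewhere in the project.

LOGGER = logging.getLogger(__name__)


@shared_task(name='Cron worker')
def djcron_worker(job_id, *args, **kwargs):
    """Launch a job as a chain: load the script, run it, store the results."""

    exec_id = uuid4().hex
    LOGGER.debug('[%s] Launching job', exec_id)

    # .si() is immutable; each .s() receives the previous task's result.
    load = djcron_retrieve_script.si(exec_id=exec_id, job_id=job_id)
    run = run_script.s(exec_id=exec_id)
    store = djcron_save_results.s(exec_id=exec_id, job_id=job_id)
    error = djcron_error.s(exec_id=exec_id, job_id=job_id)

    chain = load | run | store
    # link_error is a method; assigning to it would not register the errback.
    chain.link_error(error)
    chain.delay()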
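
To follow the control flow of the chain without a broker, here is a plain-Python stand-in. It is not Celery: it assumes the simplified semantics that each step receives the previous step's result and that any exception is handed to an error callback. `run_chain` and the three step functions are hypothetical.

```python
def run_chain(steps, on_error):
    """Run steps in order, feeding each result to the next step.

    If a step raises, call on_error with the exception and stop.
    """
    result = None
    for index, step in enumerate(steps):
        try:
            # The first step takes no input, like an immutable signature.
            result = step() if index == 0 else step(result)
        except Exception as exc:
            on_error(exc)
            return None
    return result


def load():
    # Stands in for retrieving the script.
    return 'echo hello'


def run(script):
    # Stands in for the agent executing the script.
    return {'script': script, 'returncode': 0}


def store(outcome):
    # Stands in for saving the results on the server.
    return 'stored rc=%d' % outcome['returncode']


errors = []
print(run_chain([load, run, store], errors.append))  # stored rc=0
```

If any step raises, the remaining steps are skipped and only the error callback sees the exception, which mirrors the try/except picture of the simplified canvas.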
            

This should be improved to allow more customization:

  • Specific node
  • Operating system
  • User
  • Cluster name
  • Just one/all of them
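
A minimal sketch of what such targeting could look like, assuming each agent advertises a few attributes. The `Agent` fields, `select_agents`, and the `run_on_all` flag are hypothetical names chosen for illustration; the slides do not define them.

```python
from collections import namedtuple

# Hypothetical description of an agent that can run tasks.
Agent = namedtuple('Agent', 'node os user cluster')


def select_agents(agents, run_on_all=False, **constraints):
    """Return agents whose attributes equal every given constraint.

    With run_on_all=False only the first match is returned.
    """
    matches = [
        agent for agent in agents
        if all(getattr(agent, key) == value
               for key, value in constraints.items())
    ]
    return matches if run_on_all else matches[:1]


agents = [
    Agent('web1', 'linux', 'www', 'frontend'),
    Agent('web2', 'linux', 'www', 'frontend'),
    Agent('db1', 'linux', 'postgres', 'storage'),
]

one = select_agents(agents, cluster='frontend')
everyone = select_agents(agents, run_on_all=True, cluster='frontend')
print([a.node for a in one])       # ['web1']
print([a.node for a in everyone])  # ['web1', 'web2']
```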

Dreaming: the future

  • Tests
  • Customization of where jobs run
  • Resource monitoring (CPU, memory, network, ...)
  • Alerts (task/host failing too often, ...)
  • Per-user permissions
  • Docker containers/VMs

Contact