Project author: daskos

Project description:
Extensible Python Framework for Apache Mesos
Language: Python
Repository: git://github.com/daskos/mentor.git
Created: 2016-02-14T18:05:16Z
Project community: https://github.com/daskos/mentor

License: Apache License 2.0

Download


Join the chat at https://gitter.im/daskos/mentor

An extensible Mesos library for Python

aka. the distributed snake-charmer

Mentor aims to simplify writing Python frameworks for Mesos. It provides
multiple components and interfaces to cover varying levels of complexity.

Notable Features

  • Comfortable Pythonic interface instead of the C++ syntax
  • Magical Protobuf wrapper to easily extend messages with custom functionality
  • Multiple weighted Bin-Packing heuristics for optimized scheduling
  • Easily extensible QueueScheduler implementation
  • Python multiprocessing.Pool interface

Install

pip install mentor, or use the daskos/mentor Docker image

Requirements:

  • mesos.interface (installable via pip)
  • mesos.native (binary .egg downloadable from mesosphere.io)

Configuration:

  • MESOS_MASTER=zk://127.0.0.1:2181/mesos
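
Mentor reads the master address from the environment. A typical shell setup, using the example ZooKeeper address above, might look like:

```shell
# Point mentor at the Mesos master registered in ZooKeeper
export MESOS_MASTER=zk://127.0.0.1:2181/mesos
```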

Examples

Futures Interface

It’s almost identical to Python’s concurrent.futures interface, but runs
tasks concurrently on a Mesos cluster.

```python
from mentor.apis.futures import MesosPoolExecutor
from mentor.proxies.messages import Cpus, Mem

with MesosPoolExecutor(name='futures-pool') as executor:
    def mul(a, b):
        return a * b

    future = executor.submit(mul, args=[3, 5])
    assert future.result(timeout=5) == 15

    it = executor.map(mul, range(10), range(10), timeout=5,
                      resources=[Cpus(0.1), Mem(128)])
    assert list(it) == [i**2 for i in range(10)]
```

Multiprocessing

It’s similar to python’s
multiprocessing interface
but runs processes on a Mesos cluster (concurrently).

```python
from __future__ import print_function
from mentor.apis.multiprocessing import Pool

with Pool(name='mentor-pool') as pool:
    def mul(a, b):
        return a * b

    res1 = pool.apply_async(lambda a, b: a + b, [1, 2])
    res2 = pool.apply_async(mul, [2, 3])
    pool.wait()
    print(res1.get())
    print(res2.get())
```

Work Queue Scheduler

Basic scheduler to submit various kinds of workloads, e.g.:

  • bash commands
  • docker executable containers
  • python callables
  • customized tasks (e.g. function executed via pypy)

```python
from __future__ import print_function
from mentor.scheduler import QueueScheduler, Running
from mentor.messages import PythonTask
from mentor.proxies.messages import Disk, Mem, Cpus

scheduler = QueueScheduler()
task = PythonTask(fn=sum, args=[range(10)], name='mentor-task',
                  resources=[Cpus(0.1), Mem(128), Disk(512)])

with Running(scheduler, name='mentor-scheduler'):
    res = scheduler.submit(task)  # returns an AsyncResult
    print(res.get(timeout=30))
```

Custom Scheduler

You can build your own scheduler on top of QueueScheduler, or, for more
complex needs, implement the Scheduler interface from scratch. (In the latter
case you’ll have to reimplement some of the functionality QueueScheduler
already provides.)

```python
from __future__ import print_function
import logging

from mentor.scheduler import QueueScheduler, Running
from mentor.messages import PythonTask
from mentor.proxies.messages import Disk, Mem, Cpus


class CustomScheduler(QueueScheduler):

    def on_update(self, driver, status):
        """You can hook on the events defined in the Scheduler interface.

        They're just more conveniently named methods for the basic
        mesos.interface functions, but this is how you can add some
        custom logic to your framework in an easy manner.
        """
        logging.info(
            "Status update received for task {}".format(status.task_id))
        super(CustomScheduler, self).on_update(driver, status)


scheduler = CustomScheduler()
task = PythonTask(fn=sum, args=[range(9)], name='mentor-task',
                  resources=[Cpus(0.1), Mem(128), Disk(512)])

with Running(scheduler, name='mentor-custom-scheduler'):
    res = scheduler.submit(task)
    print(res.get(timeout=60))
```

You can also implement your own resource-offer handling logic by overriding
the on_offers(self, driver, offers) method, where comparable Offers and
TaskInfos (with basic arithmetic operators overloaded) give you a helping
hand.

```python
from mentor.interface import Scheduler
from mentor.proxies.messages import Offer, TaskInfo


class CustomScheduler(Scheduler):
    ...
    def on_offers(self, driver, offers):
        ...
        task = self.get_next_task()
        for offer in offers:
            if task < offer:  # the task's resources fit into the offer
                task.slave_id = offer.slave_id
                driver.launch(offer, [task])
                break
        # decline unused offers or launch with an empty task list
        ...
```
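
The task < offer fit check above works because the message proxies overload comparison and arithmetic operators. Here is a minimal standalone sketch of the idea, using a made-up Resources class rather than mentor’s actual Offer/TaskInfo proxies:

```python
# Illustrative only: `Resources` is an invented class, not mentor's API.
class Resources(object):
    def __init__(self, cpus, mem):
        self.cpus = cpus
        self.mem = mem

    def __lt__(self, other):
        # A task "fits" an offer when every resource dimension fits
        return self.cpus < other.cpus and self.mem < other.mem

    def __sub__(self, other):
        # Arithmetic support: subtract consumed resources from an offer
        return Resources(self.cpus - other.cpus, self.mem - other.mem)


task = Resources(cpus=0.1, mem=128)
offer = Resources(cpus=2.0, mem=1024)
assert task < offer   # the task fits into the offer
left = offer - task   # remaining capacity after placement
```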

Optimized Task Placement

Mentor implements multiple weighted heuristics to solve the
Bin-Packing Problem:

  • First-Fit
  • First-Fit-Decreasing
  • Max-Rest
  • Best-Fit
  • Best-Fit-Decreasing

see binpack.py.
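
As an illustration of how such a heuristic operates, here is a minimal First-Fit-Decreasing sketch on plain (name, size) tuples; the ffd function and its data shapes are invented for this example and are not binpack.py’s API:

```python
def ffd(tasks, bins):
    """First-Fit-Decreasing: sort tasks by size (largest first), then
    place each into the first bin with enough remaining capacity."""
    placement = {}
    remaining = list(bins)  # (bin_id, capacity) pairs
    for name, size in sorted(tasks, key=lambda t: t[1], reverse=True):
        for i, (bin_id, capacity) in enumerate(remaining):
            if size <= capacity:
                placement[name] = bin_id
                remaining[i] = (bin_id, capacity - size)
                break
    return placement


# Three tasks packed onto two offers
print(ffd([('a', 2), ('b', 5), ('c', 3)], [('offer1', 6), ('offer2', 5)]))
```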

The benefits of bin-packing have been demonstrated by Netflix/Fenzo in
Heterogeneous Resource Scheduling Using Apache Mesos.

Built in Task Types

Command

The most basic task executes a simple command: Mesos runs CommandInfo’s
value with /bin/sh -c. If you want to run your task in a Docker container,
you can provide additional container information on the task.

```python
from mentor.proxies.messages import TaskInfo, CommandInfo

task = TaskInfo(name='command-task', command=CommandInfo(value='echo 100'))
task.container.type = 'DOCKER'
task.container.docker.image = 'daskos/mentor:latest'
```

Python

PythonTask is capable of running arbitrary python code on
your cluster. It sends cloudpickled
methods and arguments to the matched mesos-slave for execution.
Note that python tasks run in daskos/mentor
Docker container by default.

```python
from mentor.messages import PythonTask

# You can pass a function or a lambda in place of sum for fn.
task = PythonTask(name='python-task', fn=sum, args=[range(5)])
```
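
The serialization step can be illustrated with the standard library’s pickle; cloudpickle extends the same protocol to lambdas and closures, which plain pickle cannot handle:

```python
import pickle

# Serialize the function and its arguments, as if shipping them to a
# mesos-slave, then unpack and call them "on the other side".
payload = pickle.dumps((sum, [range(5)]))
fn, args = pickle.loads(payload)
assert fn(*args) == 10  # sum(range(5))
```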

Custom Task

Custom tasks can be written by extending TaskInfo or any of its existing
descendants. If you’re walking down the former path, you’ll most likely have
to deal with protobuf in your code; worry not, we have some magic wrappers
for you to provide customizable messages.

```python
from __future__ import print_function
import logging

from mentor.proxies.messages import TaskInfo
from mesos.interface import mesos_pb2


class CustomTask(TaskInfo):
    # descriptive protobuf template the wrapper is matched against
    proto = mesos_pb2.TaskInfo(
        labels=mesos_pb2.Labels(
            labels=[mesos_pb2.Label(key='custom')]))

    @property
    def uppercase_task_name(self):
        return self.name.upper()

    def on_update(self, status):
        logging.info('Custom task has received a status update')

    def custom_method(self):
        print("Arbitrary stuff")
```

One-Off Executor

This Executor implementation simply runs the received python function with the
provided arguments, then sends back the result in a reliable fashion.

```python
import logging
import threading

from mentor.interface import Executor


class OneOffExecutor(Executor):

    def on_launch(self, driver, task):
        def run_task():
            driver.update(task.status('TASK_RUNNING'))
            logging.info('Sent TASK_RUNNING status update')
            try:
                logging.info('Executing task...')
                result = task()
            except Exception as e:
                logging.exception('Task errored')
                driver.update(task.status('TASK_FAILED', message=e.message))
                logging.info('Sent TASK_FAILED status update')
            else:
                driver.update(task.status('TASK_FINISHED', data=result))
                logging.info('Sent TASK_FINISHED status update')

        thread = threading.Thread(target=run_task)
        thread.start()
```

Warning

This is a pre-release!

  • proper documentation
  • python futures api
  • more detailed examples
  • a CONTRIBUTION guide
  • dask.mesos backend

are coming!