Configuring birding

birding uses a validated configuration file for runtime details.

Configuration files are written in YAML. Every value has a default (listed below) and can be set by a value of the same name in the configuration file. By default, the configuration file is birding.yml in the current working directory. If needed, set the BIRDING_CONF environment variable to the path of a different configuration file.
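The lookup order described above can be sketched in a few lines of Python. This is an illustration only, not birding's own loader: the helper name resolve_config_path is hypothetical, and treating an empty BIRDING_CONF as unset is an assumption.

```python
import os


def resolve_config_path(environ=None, default="birding.yml"):
    """Return the path named by BIRDING_CONF, else the default path.

    Hypothetical helper for illustration; birding's loader may differ.
    """
    environ = os.environ if environ is None else environ
    # Assumption: an empty BIRDING_CONF is treated the same as an unset one.
    return environ.get("BIRDING_CONF") or default


if __name__ == "__main__":
    # A relative result is resolved against the current working directory.
    print(os.path.abspath(resolve_config_path()))
```

Passing the environment as an argument keeps the helper easy to exercise without modifying the real process environment.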

The scope of the configuration file is limited to details of birding itself, not of Storm-related topics. Storm details are in the project topology definition.

When a configuration value is a Python dotted name, it is a string reference to the Python object to import. In general, when the value is just an object name without a full namespace, it is assumed to be in the relevant birding namespace, e.g. LRUShelf is assumed to be birding.shelf.LRUShelf. The respective *_init configuration values specify keyword (not positional) arguments to pass to the class constructor.
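As a rough sketch of how a dotted name and its *_init mapping could be turned into an object, the following uses only the standard library. The function names import_name and build are hypothetical, and the example resolves names from the collections module rather than from birding, so it runs without birding installed. It is not birding's actual implementation.

```python
import importlib


def import_name(name, default_namespace):
    """Import the object referenced by a dotted name.

    A bare name (no dot) is assumed to live in default_namespace.
    """
    if "." not in name:
        name = default_namespace + "." + name
    module_name, _, attr = name.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def build(name, init_kwargs, default_namespace):
    """Construct an object, passing init_kwargs as keyword arguments only."""
    cls = import_name(name, default_namespace)
    return cls(**init_kwargs)


if __name__ == "__main__":
    # Stand-in for a pair such as shelf_class / shelf_init.
    counter = build("Counter", {"a": 2}, default_namespace="collections")
    print(counter["a"])
```

Because the init mapping is unpacked with `**`, only keyword arguments can reach the constructor, which matches the keyword-only behavior described above.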

See Using birding in production for further discussion on configuration in production environments.

For advanced API usage, see get_config(). The config includes an Appendix section for any additional values not known to birding; these values are available in config['Appendix'] and bypass all validation. This is useful for code that uses birding's config loader and needs to define additional values.
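A minimal sketch of reading application-specific values from the Appendix follows. A plain dictionary stands in for the loaded config, since the exact signature of get_config() is not shown here. The helper appendix_value and the my_app and batch_size keys are hypothetical.

```python
def appendix_value(config, key, default=None):
    """Read an application-specific value from the Appendix section.

    Hypothetical helper; config is any mapping shaped like the loaded config.
    """
    # Tolerate a missing or empty Appendix section.
    appendix = config.get("Appendix") or {}
    return appendix.get(key, default)


# Stand-in for the mapping a config loader would return.
config = {
    "Spout": "TermCycleSpout",
    "Appendix": {"my_app": {"batch_size": 50}},
}

if __name__ == "__main__":
    print(appendix_value(config, "my_app"))
```

Since Appendix values bypass validation, code that reads them is responsible for checking types and supplying its own defaults.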


Spout: TermCycleSpout
  - real-time analytics
  - apache storm
  - pypi
  class: birding.twitter.TwitterSearchManagerFromOAuth
  init: {}
  shelf_class: FreshLRUShelf
  shelf_init: {}
  shelf_expiration: 300
  elasticsearch_class: elasticsearch.Elasticsearch
    - localhost: 9200
  index: tweet
  doc_type: tweet
  kafka_class: pykafka.KafkaClient
    hosts: # comma-separated list of hosts
  topic: tweet
  shelf_class: ElasticsearchShelf
  shelf_init: {}
  shelf_expiration: null
Appendix: {}