diff --git a/doc/source/getting_started.rst b/doc/source/getting_started.rst new file mode 100644 index 0000000..9ec35d5 --- /dev/null +++ b/doc/source/getting_started.rst @@ -0,0 +1,480 @@ +Getting started with YAQL +========================= + +Introduction to YAQL +-------------------- + +YAQL (Yet Another Query Language) is an embeddable and extensible query +language that allows performing complex queries against arbitrary data structures. +`Embeddable` means that you can easily integrate a YAQL query processor in your code. Queries come +from your DSLs (domain specific language), user input, JSON, and so on. YAQL has a +vast and comprehensive standard library of functions that can be used to query data of any complexity. +Also, YAQL can be extended even further with user-specified functions. +YAQL is written in Python and is distributed through PyPI. + +YAQL was inspired by Microsoft LINQ for Objects and its first aim is to execute expressions +on the data in memory. A YAQL expression has the same role as an SQL query to databases: +search and operate the data. In general, any SQL query can be transformed to a YAQL expression, +but YAQL can also be used for computational statements. For example, `2 + 3*4` is a valid +YAQL expression. + +Moreover, in YAQL, the following operations are supported out of the box: + +* Complex data queries +* Creation and transformation of lists, dicts, and arrays +* String operations +* Basic math operations +* Conditional expression +* Date and time operations (will be supported in yaql 1.1) + +An interesting thing in YAQL is that everything is a function and any function can +be customized or overridden. This is true even for built-in functions. +YAQL cannot call any function that was not explicitly registered to be accessible +by YAQL. The same is true for operators. + +YAQL can be used in two different ways: as an independent CLI tool, and as a +Python module. + +Installation +------------ + +You can install YAQL in two different ways: + +#. Using PyPi: + + .. code-block:: console + + pip install yaql + +#. Using your system package manager (for example Ubuntu): + + .. code-block:: console + + sudo apt-get install python-yaql + +HowTo: Use YAQL in Python +------------------------- + +You can operate with YAQL from Python in three easy steps: + +* Create a YAQL engine +* Parse a YAQL expression +* Execute the parsed expression + +.. NOTE:: + The engine should be created once for a set of operators and parser rules. It can + be reused for all queries. + +Here is an example how it can be done with the YAML file which looks like: + +.. code-block:: yaml + + customers_city: + - city: New York + customer_id: 1 + - city: Saint Louis + customer_id: 2 + - city: Mountain View + customer_id: 3 + customers: + - customer_id: 1 + name: John + orders: + - order_id: 1 + item: Guitar + quantity: 1 + - customer_id: 2 + name: Paul + orders: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + -customer_id: 3 + name: Diana + orders: + - order_id: 4 + item: Drums + quantity: 1 + +.. code-block:: python + + import yaql + import yaml + + data_source = yaml.load(open('shop.yaml', 'r')) + + engine = yaql.factory.YaqlFactory().create() + + expression = engine( + '$.customers.orders.selectMany($.where($.order_id = 4))') + + order = expression.evaluate(data=data_source) + +Content of the ``order`` will be the following: + +.. code-block:: console + + [{u'item': u'Drums', u'order_id': 4, u'quantity': 1}] + +YAQL grammar +------------ + +YAQL has a very simple grammar: + +* Three keywords as in JSON: true, false, null +* Numbers, such as 12 and 34.5 +* Strings: `'foo'` and `"bar"` +* Access to the data: $variable, $ +* Binary and unary operators: 2 + 2, -1, 1 != 2, $list[1] + +Data access +~~~~~~~~~~~ + +Although YAQL expressions may be self-sufficient, the most important value of YAQL +is its ability to operate on user-passed data. Such data is placed into variables +which are accessible in a YAQL expression as `$`. The `variable_name` +can contain numbers, English alphabetic characters, and underscore symbols. The `variable_name` +can be empty, in this case you will use `$`. Variables can be set prior to executing +a YAQL expression or can be changed during the execution of some functions. + +According to the convention in YAQL, function parameters, including input data, +are stored in variables like `$1`, `$2`, and so on. The `$` stands for `$1`. +For most cases, all function parameters are passed in one piece and can be accessed +using `$`, that is why this variable is the most used one in YAQL expressions. +Besides, some functions are expected to get a YAQL expression as one of the +parameters (for example, a predicate for collection sorting). In this case, +passed expression is granted access to the data by `$`. + +Strings +~~~~~~~ + +In YAQL, strings can be enclosed in `"` and `'`. Both types are absolutely equal and +support all standard escape symbols including unicode code-points. In YAQL, both types +of quotes are useful when you need to include one type of quotes into the +other. In addition, ` is used to create a string where only one escape symbol \` is possible. +This is especially suitable for regexp expressions. + +If a string does not start with a digit or `__` and contains only digits, `_`, and English letters, +it is called identifier and can be used without quotes at all. An identifier can be used +as a name for function, parameter or property in `$obj.property` case. + +Functions +~~~~~~~~~ + +A function call has syntax of `functionName(functionParameters)`. Brackets are necessary +even if there are no parameters. In YAQL, there are two types of parameters: + +* Positional parameters + ``foo(1, 2, someValue)`` +* Named parameters + ``foo(paramName1 => value1, paramName2 => 123)`` + +Also, a function can be called using both positional and named parameters: ``foo(1, false, param => null)``. +In this case, named arguments must be written after positional arguments. In +``name => value``, `name` must be a valid identifier and must match the name of +parameter in function definition. Usually, arguments can be passed in both ways, +but named-only parameters are supported in YAQL since Python 3 supports them. + +Parameters can have default values. Named parameters is a good way to pass only needed +parameters and skip arguments which can be use default values, also you can simply +skip parameters in function call: ``foo(1,,3)``. + +In YAQL, there are three types of functions: + +* Regular functions: ``max(1,2)`` +* Method-like functions, which are called by specifying an object for which the + function is called, followed by a dot and a function call: ``stringValue.toUpper()`` +* Extension methods, which can be called both ways: ``len(string)``, ``string.len()`` + +YAQL standard library contains hundreds of functions which belong to one of these types. +Moreover, applications can add new functions and override functions from the standard library. + +Operators +~~~~~~~~~ + +YAQL supports the following types of operators out of the box: + +* Arithmetic: `+`. `-`, `*`, `/`, `mod` +* Logical: `=`, `!=`, `>=`, `<=`, `and`, `or`, `not` +* Regexp operations: `=~`, `!~` +* Method call, call to the attribute: `.`, `?.` +* Context pass: `->` +* Indexing: `[ ]` +* Membership test operations: `in` + +Data structures +~~~~~~~~~~~~~~~ + +YAQL supports these types out of the box: + + +* Scalars + + YAQL supports such types as string, int. boolean. Datetime and timespan + will be available after yaql 1.1 release. + +* Lists + + List creation: ``[1, 2, value, true]`` + Alternative syntax: ``list(1, 2, value, true)`` + List elemenets can be accesessed by index: ``$list[0]`` + +* Dictionaries + + Dict creation: ``{key1 => value1, true => 1, 0 => false}`` + Alternative syntax: ``dict(key1 => value1, true => 1, 0 => false)`` + Dictionaries can be indexed by keys: ``$dict[key]``. Exception will be raised + if the key is missing in the dictionary. Also, you can specify value which will + be returned if the key is not in the dictionary: ``dict.get(key, default)``. + + .. NOTE:: + During iteration through the dictionary, `key` can be called like: ``$.key`` + +* (Optional) Sets + + Set creation: ``set(1, 2, value, true)`` + +.. NOTE:: + YAQL is designed to keep input data unchanged. All the functions that + look as if they change data, actually return an updated copy and keep the original + data unchanged. This is one reason why YAQL is thread-safe. + +Basic YAQL query operations +--------------------------- + +It is obvious that we can compare YAQL with SQL as they both are designed to solve +similar tasks. Here we will take a look at the YAQL functions which have a direct +equivalent with SQL. + +We will use YAML from `HowTo: use YAQL in Python`_ as a data source in our examples. + + +Filtering +~~~~~~~~~ + +.. NOTE:: + + Analog is SQL WHERE + +The most common query to the data sets is filtering. This is a type of +query which will return only elements for which the filtering query is true. In YAQL, +we use ``where`` to apply filtering queries. + +.. code-block:: console + + yaql> $.customers.where($.name = John) + +.. code-block:: yaml + + - customer_id: 1 + name: John + orders: + - order_id: 1 + item: Guitar + quantity: 1 + + +Ordering +~~~~~~~~ + +.. NOTE:: + + Analog is SQL ORDER BY + +It may be required to sort the data returned by some YAQL query. The ``orderBy`` clause will cause +the elements in the returned sequence to be sorted according to the default comparer +for the type being sorted. For example, the following query can be extended to sort +the results based on the profession property. + +.. code-block:: console + + yaql> $.customers.orderBy($.name) + +.. code-block:: yaml + + - customer_id: 3 + name: Diana + orders: + - order_id: 4 + item: Drums + quantity: 1 + - customer_id: 1 + name: John + orders: + - order_id: 1 + item: Guitar + quantity: 1 + - customer_id: 2 + name: Paul + orders: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + +Grouping +~~~~~~~~ + +.. NOTE:: + + Analog is SQL GROUP BY + +The ``groupBy`` clause allows you to group the results according to the key you specified. +Thus, it is possible to group example json by gender. + +.. code-block:: console + + yaql> $.customers.groupBy($.name) + +.. code-block:: yaml + + - Diana: + - customer_id: 3 + name: Diana + orders: + - order_id: 4 + item: Drums + quantity: 1 + - Paul: + - customer_id: 2 + name: Paul + orders: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + - John: + - customer_id: 1 + name: John + orders: + - order_id: 1 + item: Guitar + quantity: 1 + +So, here you can see the difference between ``groupBy`` and ``orderBy``. We use +the same parameter `name` for both operations, but in the output for ``groupBy`` +`name` is located in additional place before everything else. + +Selecting +~~~~~~~~~ + +.. NOTE:: + + Analog is SQL SELECT + +The ``select`` method allows building new objects out of objects of some collection. +In the following example, the result will contain a list of name/orders pairs. + +.. code-block:: console + + yaql> $.customers.select([$.name, $.orders]) + +.. code-block:: console + + - John: + - order_id: 1 + item: Guitar + quantity: 1 + - Paul: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + - Diana: + - order_id: 4 + item: Drums + quantity: 1 + +Joining +~~~~~~~ + +.. NOTE:: + + Analog is SQL JOIN + +The ``join`` method creates a new collection by joining two other collections by +some condition. + +.. code-block:: console + + yaql> $.customers.join($.customers_city, $1.customer_id = $2.customer_id, {customer=>$1.name, city=>$2.city, orders=>$1.orders}) + +.. code-block:: yaml + + - customer: John + city: New York + orders: + - order_id: 1 + item: Guitar + quantity: 1 + - customer: Paul + city: Saint Louis + orders: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + - customer: Diana + city: Mountain View + orders: + - order_id: 4 + item: Drums + quantity: 1 + + +Take an element from collection +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +YAQL supports two general methods that can help you to take elements from collection +``skip`` and ``take``. + +.. code-block:: console + + yaql> $.customers.skip(1).take(2) + +.. code-block:: yaml + + - customer_id: 2 + name: Paul + orders: + - order_id: 2 + item: Banjo + quantity: 2 + - order_id: 3 + item: Piano + quantity: 1 + - customer_id: 3 + name: Diana + orders: + - order_id: 4 + item: Drums + quantity: 1 + +First element of collection +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``first`` method will return the first element of a collection. + +.. code-block:: console + + yaql> $.customers.first() + +.. code-block:: yaml + + - customer_id: 1 + name: John + orders: + - order_id: 1 + item: Guitar + quantity: 1 diff --git a/doc/source/index.rst b/doc/source/index.rst index f0aca26..6484f0d 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -14,6 +14,7 @@ Introduction readme what_is_yaql + getting_started Usage ~~~~~