POSTS
How to read data from a kdb+ server to Jupyter or Python
Right. So your company has a kdb+ server running somewhere, and you wanna do some fancy work with the data they hold.
Here is the tutorial for that. Before we get started, please make sure that we have the following:
Requirements
- Python 2.7+ or 3.4+
- Knows how to install a Python package using PIP
- Access to a kdb+ server, knowing its hostname/IP address, and a valid user/password pair
If you do not have experience with PIP and are using Anaconda, the easiest way to do this is to open Anaconda Prompt, and type the pip commands (shown below) there. See here for more information.
Step 1: Install qPython package
This is simple. Run pip install qpython. Make sure you have called the right pip and set up any virtualenvs as needed.
Step 2: Create connection to server
Now, create a brand new Python script, import the relevant classes and create a connection to the kdb+ server:
from qpython import qconnection
q = qconnection.QConnection(host = 'localhost', port = 5000, username = 'tu', password = 'secr3t')
try:
q.open()
# The connection is ready to use
finally:
q.close()
The first line of code imports the relevant classes from the package. The second line creates an abstract connection to the server – except that it hasn’t been actually connected, until the open() call.
The try block wraps around the connection, and ensures that the connection is closed no matter what happened to the application. Everything that has to do with the connection, has to go into the try block.
Step 3: Get data
Now suppose you want to get everything in the trades table into a variable called trades in Python. Here is how you do it:
trades = q.sync('trades')
The shape of the result is a list of tuples:
rec.array([(b'msft', 743.85915275, 3162, 748428),
(b'msft', 641.73070029, 2917, 6296936),
(b'hsbc', 838.23114205, 1492, 1523210),
(b'samsung', 278.34978381, 1983, 1778945),
(b'ibm', 838.64713156, 4006, 1501018)],
dtype=[('sym', 'S7'), ('price', '<f8'), ('size', '<i8'), ('time', '<i4')])
Confused? Let’s change that into a list explicitly:
>>> list(trades)
[(b'msft', 743.8591527482495, 3162, 748428), (b'msft', 641.7307002916932, 2917, 6296936), (b'hsbc', 838.2311420543119, 1492, 1523210), (b'samsung', 278.34978381358087, 1983, 1778945), (b'ibm', 838.6471315585077, 4006, 1501018)]
The trades that goes between the two quote marks is actually a q statement.
What that means is that, if you want to do some “advanced” features before loading them into Python (e.g. only get the stock price history of MSFT), you can do that by using the right statement.
How do you do this? This is way, way beyond the scope of this blog post, as q is well-known for being a huge headache to learn. There is a whole book dedicated for this: Q for Mortals.
Step 4: Put together
Now we can put everything together in a single script:
from qpython import qconnection
q = qconnection.QConnection(host = 'localhost', port = 5000, username = 'tu', password = 'secr3t')
try:
q.open()
trades = q.sync('trades')
print(trades)
finally:
q.close()