Extracting

About Extracting

After importing, the data can be queried through the command line. For example:

mbpy EXTRACT from-users PRINT

This prints the first 5 and last 5 user records, with every column in the users table for those records.

        id      role   oa_id sb_id account_uid
0    10752543    Admin  None  None     None     \
1    10752544    Admin  None  None     None
2    10752545  Advisor  None  None     None
3    10752546    Admin  None  None     None
4    10752547  Advisor  None  None     None      

Scroll down to see more columns for these records. (The \ indicates that the columns wrap around.)

These commands can simply display data, but they become more interesting when paired with the commands that manipulate it. For example, to find the distribution of users' email domains, we could do the following:

mbpy \
    EXTRACT \
        from-users \
            --fields email \
    INFO \
        --value-count domain \
    extract-on \
        --column email \
        --pattern '^(?P<handle>.+)@(?P<domain>.+)$'
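To make the pattern easier to read, here is a minimal Python sketch of what the named groups capture and how a per-domain count could be produced. It is an illustration only, not mbpy's implementation; the sample addresses and the domain_counts helper are hypothetical.

```python
import re
from collections import Counter

# Same pattern as passed to --pattern above: everything before the last "@"
# is captured as "handle", everything after it as "domain".
PATTERN = re.compile(r"^(?P<handle>.+)@(?P<domain>.+)$")

# Hypothetical sample data, not taken from a real school.
emails = [
    "alice@eduvo.com",
    "bob@eduvo.com",
    "carol@example.org",
    "not-an-email",
]


def domain_counts(addresses):
    """Count how often each domain occurs, skipping values that do not match."""
    counts = Counter()
    for address in addresses:
        match = PATTERN.match(address)
        if match:
            counts[match.group("domain")] += 1
    return counts


counts = domain_counts(emails)
for domain, n in counts.most_common():
    print(domain, n)
```

Values without an "@" simply do not match and are left out of the count, which is one reasonable way to treat malformed addresses in a sketch like this.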

This school is supposed to have only eduvo.com accounts. Looks like we have a few users with wrong email addresses! Let's find them:

Let's see what we got:

Let's limit to only those that are active (not archived), and let's give ourselves a link we can click:

Command-click to load the URL in the browser!

The structure of these commands is as follows:

| Command / subcommand | Explanation | Comment |
| --- | --- | --- |
| <MODE> | Either EXTRACT or STREAM | In EXTRACT mode, all the information is loaded into memory. In STREAM mode, it is loaded in chunks. |
| <extractor> | from-students, from-classes, etc. | mbpy EXTRACT --help to find the full list |
| <streamer> | See "Streaming" section | mbpy STREAM --help to find the full list |
| <LOADER> | print or pprint or csv | mbpy EXTRACT from-students --help to see the full list of loaders |
| <chain> | Perform manipulations after extraction. | Optional. |
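The difference between the two modes described above can be sketched in plain Python. This is a conceptual illustration under assumed names (records, extract, stream and chunk_size are all hypothetical), not how mbpy is actually implemented.

```python
from itertools import islice


def records(n):
    """Hypothetical record source; stands in for rows read from storage."""
    for i in range(n):
        yield {"id": i}


def extract(source):
    """EXTRACT-style: materialize every record in memory at once."""
    return list(source)


def stream(source, chunk_size):
    """STREAM-style: yield successive lists of at most chunk_size records."""
    iterator = iter(source)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


everything = extract(records(10))
chunk_sizes = [len(chunk) for chunk in stream(records(10), chunk_size=4)]
print(len(everything))
print(chunk_sizes)
```

Loading everything at once is convenient for manipulations that need the whole dataset, while chunked processing keeps memory use bounded by the chunk size.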

To learn more about how to query with extractors, continue to Querying Commands:

Querying Commands
