lcdblib.pandas package

Submodules

lcdblib.pandas.utils module

lcdblib.pandas.utils.cartesian_product(df1, df2)

Calculates the Cartesian product and returns the expanded DataFrame.

Given a pandas.DataFrame:

| sample | tissue |
|--------|--------|
| one    | ovary  |
| two    | testis |

and a set of values such as {'num': [100, 200]}, build:

| sample | tissue | num |
|--------|--------|-----|
| one    | ovary  | 100 |
| one    | ovary  | 200 |
| two    | testis | 100 |
| two    | testis | 200 |

Parameters:

- df1 (pandas.DataFrame) – A DataFrame that you want to expand.
- df2 (dict of array-like | pandas.DataFrame | pandas.Series) – The set of values that you want to expand df1 by.
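The behavior described above can be approximated with plain pandas. The following is a minimal sketch, not lcdblib's actual implementation: the name `cartesian_product_sketch` is hypothetical, and it assumes pandas 1.2 or later, where `DataFrame.merge` accepts `how="cross"`.

```python
import pandas as pd


def cartesian_product_sketch(df1, df2):
    """Hypothetical stand-in for cartesian_product; not lcdblib's code."""
    # Normalize df2 to a DataFrame so that a dict of array-likes,
    # a Series, or a DataFrame can all be handled the same way.
    if isinstance(df2, pd.Series):
        df2 = df2.to_frame()
    elif not isinstance(df2, pd.DataFrame):
        df2 = pd.DataFrame(df2)
    # A cross merge pairs every row of df1 with every row of df2.
    return df1.merge(df2, how="cross")


samples = pd.DataFrame({"sample": ["one", "two"], "tissue": ["ovary", "testis"]})
expanded = cartesian_product_sketch(samples, {"num": [100, 200]})
print(expanded)
```

With the two-row input from the example, `expanded` has four rows: each sample paired with each value of `num`.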

lcdblib.pandas.utils.tidy_dataframe(df, column, sep='|')

Given a DataFrame in which one column holds delimiter-separated strings, create a new DataFrame with a separate row for each value in each string. Values in the other columns are repeated for every new row.

E.g.:

       gene  score
   g1|g2      1
      g3      5
g4|g5|g6      9

becomes:

tidy_dataframe(df, 'gene', sep='|')
#   gene  score
# 0   g1      1
# 0   g2      1
# 1   g3      5
# 2   g4      9
# 2   g5      9
# 2   g6      9
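A similar result can be reproduced with built-in pandas methods. This is a sketch under stated assumptions, not the library's own code: the name `tidy_dataframe_sketch` is hypothetical, and it relies on `Series.str.split` and `DataFrame.explode` (pandas 0.25 or later).

```python
import pandas as pd


def tidy_dataframe_sketch(df, column, sep="|"):
    """Hypothetical stand-in for tidy_dataframe; not lcdblib's code."""
    # Split each string into a list of values. pandas treats a
    # single-character separator as a literal string, so '|' is not
    # interpreted as regex alternation here.
    split_values = df[column].str.split(sep)
    # explode() emits one row per list element and repeats the
    # original index label for rows that came from the same string.
    return df.assign(**{column: split_values}).explode(column)


df = pd.DataFrame({"gene": ["g1|g2", "g3", "g4|g5|g6"], "score": [1, 5, 9]})
tidy = tidy_dataframe_sketch(df, "gene", sep="|")
print(tidy)
```

The repeated index labels (0, 0, 1, 2, 2, 2) match the output shown in the example; call `reset_index(drop=True)` on the result if a unique index is needed.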
