lcdblib.pandas.utils.cartesian_product(df1, df2)
Calculates the Cartesian product and returns the expanded DataFrame.
Given a pandas.DataFrame:
| sample | tissue |
|--------|--------|
| one | ovary |
| two | testis |
and a set of values {'num': [100, 200]}, build:
| sample | tissue | num |
|--------|--------|-----|
| one | ovary | 100 |
| one | ovary | 200 |
| two | testis | 100 |
| two | testis | 200 |
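The expansion shown in the tables can be reproduced with plain pandas. The following is only a minimal sketch and not the lcdblib implementation: the helper name `cartesian_product_sketch` and the temporary `_key` column are hypothetical, and it assumes the second argument is a DataFrame built from the value set.

```python
import pandas as pd

# Input frame from the example above.
samples = pd.DataFrame({"sample": ["one", "two"], "tissue": ["ovary", "testis"]})

# Assumption: the value set {'num': [100, 200]} is passed as a one-column DataFrame.
values = pd.DataFrame({"num": [100, 200]})


def cartesian_product_sketch(df1, df2):
    """Pair every row of df1 with every row of df2 (hypothetical helper)."""
    # A constant temporary key makes every left row match every right row.
    left = df1.assign(_key=1)
    right = df2.assign(_key=1)
    merged = left.merge(right, on="_key")
    # Drop the helper column and renumber the rows.
    return merged.drop(columns="_key").reset_index(drop=True)


expanded = cartesian_product_sketch(samples, values)
print(expanded)
```

For the two inputs above, `expanded` has four rows, one for each sample/num pair, and the columns sample, tissue, and num.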
lcdblib.pandas.utils.tidy_dataframe(df, column, sep='|')
Given a dataframe with a column of delimiter-separated strings, create a new dataframe with a separate row for each value in each string.
E.g.:
gene score
0 g1|g2 1
1 g3 5
2 g4|g5|g6 9
becomes:
tidy_dataframe(df, 'gene', sep='|')
# gene score
# 0 g1 1
# 0 g2 1
# 1 g3 5
# 2 g4 9
# 2 g5 9
# 2 g6 9
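The same reshaping can be approximated with plain pandas. This is a hedged sketch rather than the lcdblib source: the helper name `tidy_sketch` is hypothetical, and it assumes every entry in the target column is a string and that `DataFrame.explode` is available (pandas 0.25 or later).

```python
import pandas as pd

# Input frame from the example above.
df = pd.DataFrame({"gene": ["g1|g2", "g3", "g4|g5|g6"], "score": [1, 5, 9]})


def tidy_sketch(df, column, sep="|"):
    """Give each delimited value its own row (hypothetical helper)."""
    out = df.copy()
    # Plain Python str.split treats the separator literally, so '|' is not
    # interpreted as a regular-expression alternation.
    out[column] = out[column].map(lambda value: value.split(sep))
    # explode() emits one row per list element and repeats the original index.
    return out.explode(column)


tidy = tidy_sketch(df, "gene", sep="|")
print(tidy)
```

Because `explode` repeats the original index label for every value that came from the same row, the result keeps the 0, 0, 1, 2, 2, 2 index seen in the example output. Call `reset_index(drop=True)` on the result if a fresh index is preferred.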