Solutions

Try to work out the solutions yourself before looking them up here. All documentation required to answer the questions is available through the hyperlinks in the tutorial.

Running anuran on the FlashWeave networks

If you run anuran, the software will warn you that there are not enough networks to do a thorough statistical analysis. However, it does export a file with the intersection (overlap) between the networks and the null models. Since we used the -draw flag, anuran will already plot some of the results. The most informative image is the one called 'demo_sponges_setsizes.png'. Because the null models are generated randomly, your output will not be identical, but it should look similar to the one below.

As you can see, the difference in particular is smaller for the real data (labelled 'Input') than for the null models. However, the intersection is close to zero. This happens because there are simply no associations that are conserved across all networks! We can adjust the -size parameter to tackle this. With 10 networks, a size of 0.2 will look at overlaps between 2 networks.

anuran -i folder -o demo -draw -size 0.2 0.3 0.5 0.7 1
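
To see which network counts the -size values correspond to, the fractions can be multiplied by the number of networks. The snippet below is only a small sketch of that arithmetic in R. It assumes 10 networks, as in this demo, and it assumes simple rounding; how anuran itself converts fractions to counts is not confirmed here.

```r
# Assumption: the input folder contains 10 networks, as in this demo.
n_networks <- 10

# The fractions passed to -size in the command above.
sizes <- c(0.2, 0.3, 0.5, 0.7, 1)

# Assumption: a fraction is turned into a network count by rounding.
min_networks <- round(sizes * n_networks)

# Show each fraction next to the number of networks it corresponds to.
print(data.frame(size = sizes, networks = min_networks))
```

With these assumptions, a size of 1 corresponds to the original intersection across all 10 networks, while the smaller values relax that requirement.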

Showing changes in edges across networks

To plot changes across networks, we only need a few lines of R with ggplot2:

library(ggplot2)

# Load the set sizes and keep only rows that have an absolute set type
data <- read.csv('demo_sets.csv')
data <- data[!is.na(data$Set.type..absolute.), ]
# Fix the order in which the networks appear in the legend
data$Network <- factor(data$Network, c('Input', 'Degree', 'Random'))
ggplot(data, aes(x=Set.type..absolute., y=Set.size, colour=Network)) +
  geom_point() +
  theme_minimal() +
  labs(x='Edges present in at least this number of networks', y='Number')
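
The column name Set.type..absolute. may look odd. It is what read.csv produces when a header contains spaces and brackets, because those characters are replaced by dots. The sketch below illustrates this and the NA filtering step on a toy table. The header names and all values are made up for illustration and are not real anuran output; which rows actually lack an absolute set type in your file is also an assumption.

```r
# Toy CSV with made-up values. The header names are an assumption, chosen
# so that read.csv yields the column names used in the plotting code.
csv_text <- 'Network,Set type,Set size,Set type (absolute)
Input,Intersection 0.2,40,2
Input,Difference,120,NA
Random,Intersection 0.2,5,2
Random,Difference,150,NA'

toy <- read.csv(text = csv_text)

# read.csv replaces spaces and brackets in the header with dots.
print(names(toy))

# Keep only rows that have an absolute set type, as in the plotting code.
toy <- toy[!is.na(toy$Set.type..absolute.), ]

# Fix the legend order; levels that are absent from the data are harmless.
toy$Network <- factor(toy$Network, c('Input', 'Degree', 'Random'))
print(toy)
```

If your own file gives different column names, print names(data) first and adjust the plotting code accordingly.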

Positive controls

To include positive controls, we need to add two parameters. One is the core prevalence (-prev), which describes across how many networks the core edges appear. The other is the core size (-cs), which describes how large the core is relative to the total network. You can supply more than one value for each.

anuran -i folder -o demo -draw -size 0.2 0.3 0.5 0.7 1 -cs 0.2 0.5 -prev 0.2 0.5

We can use the same plotting code as above to generate the figure. The figure looks a bit cluttered and could use some additional polishing, but as you can see, we can clearly make out the different groups of positive controls! The core size parameter determines the number of edges, while the prevalence means there are still some conserved edges left even at 7 networks.

# Load the set sizes for the positive controls and keep rows with an absolute set type
data <- read.csv('demo_core_sets.csv')
data <- data[!is.na(data$Set.type..absolute.), ]
ggplot(data, aes(x=Set.type..absolute., y=Set.size, colour=Network)) +
  geom_point() +
  theme_minimal() +
  labs(x='Edges present in at least this number of networks', y='Number')