Dear Zeyu,

Thank you for your interest in this work. Unfortunately, the datasets are quite large and cannot be shared via e-mail. Note that files under the "nsfnetbw/tfrecords/train/" directory include 889M of data.

The main difference between the two datasets you mention (v0 and v1) is the set of topologies they include (see the README files for the descriptions). Also, 'datasets_v0' includes 500 iterations for each combination of routing and traffic intensity. Each iteration uses a different input traffic matrix (TM) of a given traffic intensity (TI). The method used to generate these traffic matrices is described in Section 4.1 of [1]. In 'datasets_v1', each file includes 125 iterations and contains a collection of traffic matrices covering a range of traffic intensities (<lower lambda max>-<upper lambda max>). These latter datasets also include the following information:

"5.- Average per-packet neperian logarithm of the delay over the packets transmitted in each source-destination pair".

This can be useful for probabilistic modeling, for instance to parameterize a Gamma distribution that models the delay distribution of each source-destination pair.
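
As a rough illustration of that idea (a sketch only, not code from the RouteNet repositories): given the average delay and the average natural-log delay of one source-destination pair, the shape and scale of a Gamma distribution can be estimated in closed form. The function name, argument names, and example values below are hypothetical, and the shape formula is a standard approximation to the maximum-likelihood estimate, not an exact fit.

```python
import math

def gamma_from_delay_stats(mean_delay, mean_log_delay):
    """Approximate Gamma (shape k, scale theta) from E[X] and E[ln X].

    Hypothetical helper: the argument names are assumptions, not
    field names taken from the dataset files.
    """
    # s = ln(E[X]) - E[ln X]. By Jensen's inequality s >= 0, and s > 0
    # whenever the delays are not all identical.
    s = math.log(mean_delay) - mean_log_delay
    if s <= 0:
        raise ValueError("need ln(mean_delay) > mean_log_delay")
    # Closed-form approximation to the maximum-likelihood shape parameter.
    k = (3.0 - s + math.sqrt((s - 3.0) ** 2 + 24.0 * s)) / (12.0 * s)
    # The scale follows from matching the mean: E[X] = k * theta.
    theta = mean_delay / k
    return k, theta

# Made-up example: exponentially distributed delays with mean 0.5
# satisfy E[ln X] = ln(0.5) - Euler's constant, so the estimated
# shape should come out close to 1.
EULER_GAMMA = 0.5772156649015329
k, theta = gamma_from_delay_stats(0.5, math.log(0.5) - EULER_GAMMA)
print(k, theta)
```

If a tighter fit is needed, the approximate shape could serve as the starting point for a few Newton iterations on the exact likelihood equation.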

Overall, if you want to reproduce the experiments of a paper, I recommend using the datasets used in that paper. Otherwise, you will need to modify the code to read datasets with a different format. For instance:

"Challenging the generalization capabilities of Graph Neural Networks for network modeling" -> datasets_v0 (https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/master/datasets_v0)

Also, for the paper "Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN" you should use the datasets at the following link:

https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN/tree/master/datasets

This paper presents the first version of RouteNet (https://github.com/knowledgedefinednetworking/net2vec/tree/RouteNet-SOSR/routenet), which did not have support for variable link capacity. For this reason, in these latter datasets all the links in the different topologies have the same capacity. You can check the link capacities used in the "*.ned" files that describe each topology.


[1] Krzysztof Rusek, José Suárez-Varela, Albert Mestres, Pere Barlet-Ros, Albert Cabellos-Aparicio; "Unveiling the potential of Graph Neural Networks for network modeling and optimization in SDN," in ACM Symposium on SDN Research (SOSR), pp. 140-151, 2019. Link: https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN


Regards,

José


On 14/02/20 08:00, tjuzeyuluan wrote:

Dear Professor,
    I’m Zeyu, a PhD student from UC Berkeley. I am really interested in your work on Graph Neural Network-based routing optimization. I am trying to reproduce your experiments. However, the download speed from the URL (path = '/home/datasets/SIGCOMM/nsfnetbw/tfrecords/train/') is very slow. Could you please send the zip package via e-mail? Thank you very much!
    Another question: what is the difference between dataset v0 and dataset v1? I am a little confused. Could you please explain further? Thanks a lot!

