Hi Nathan,
I answer your questions below.
- Is the delay and jitter data in all three versions of the nsfnetbw and geant2bw topology
data sets the same?
Every dataset has some particularities. The dataset used in the ACM SOSR paper
(https://dl.acm.org/citation.cfm?id=3314357) assumes that all the links have the same
capacity. Note that the RouteNet model presented in that paper did not support encoding
the link capacity. However, in the two other datasets (datasets_v0 and datasets_v1) the
topologies have variable capacity on the links.
As you noted, the topologies also vary between 'datasets_v0' and
'datasets_v1'. Moreover, for those topologies that are present in both datasets
(nsfnet, geant) the simulation samples change: they have different combinations of traffic
matrices and routings. Also, in 'datasets_v0' we consider discrete values of
traffic intensity (<lambda>) for each simulation file
(e.g., results_<topology_name>_<lambda>_<routing_scheme>.tar.gz), while
in 'datasets_v1' each simulation file contains samples simulated within a
continuous range of traffic intensities ([lower lambda, upper lambda]) that is described
in the file name (results_<topology_name>_<lower lambda max>-<upper lambda
max>_<routing_scheme>_0_124.tar.gz).
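As a rough illustration of the two naming schemes (this is not part of our code; the
example file names, lambda values and routing scheme names below are made up), a small
Python sketch that tells them apart could look like this:

import re

# Patterns for the two naming schemes described above.
V0_RE = re.compile(
    r"results_(?P<topology>.+?)_(?P<lmbda>[\d.]+)_(?P<routing>.+)\.tar\.gz")
V1_RE = re.compile(
    r"results_(?P<topology>.+?)_(?P<low>[\d.]+)-(?P<high>[\d.]+)"
    r"_(?P<routing>.+?)_0_124\.tar\.gz")


def describe(filename):
    """Return (version, topology, traffic intensity, routing scheme), or None."""
    m = V1_RE.fullmatch(filename)
    if m:
        return ("datasets_v1", m["topology"],
                (float(m["low"]), float(m["high"])), m["routing"])
    m = V0_RE.fullmatch(filename)
    if m:
        return ("datasets_v0", m["topology"], float(m["lmbda"]), m["routing"])
    return None


# Made-up example file names following the two patterns:
print(describe("results_nsfnetbw_9_SP_k_0.tar.gz"))
print(describe("results_geant2bw_400-2000_SP_k_0_0_124.tar.gz"))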
- Is the gbnbw topology from datasets_v1 the same as the GBN topology from the ACM SOSR
paper, and are the delay and jitter the same?
It is the same topology, but in 'datasets_v1' the link capacity is variable. Also,
the simulation samples (i.e., traffic and routing) are different. Consequently, the
distributions of delay and jitter are different.
- Why do we have a germany50bw instead of the synth50bw topology?
Basically, we decided to use a real-world 50-node network topology (extracted from
) instead of the synthetically generated 'synth50bw' topology.
- Can I generally assume that the datasets_v1 supersedes the previous dataset versions in
the sense that the training data is exactly the same, when the topologies are the same?
No, because of the reasons explained above.
- Can I generally assume that the later format of the data, in tar.gz files, supersedes
any earlier format?
The data format will probably evolve as we continue doing research on RouteNet. For
instance, 'datasets_v1' were used in our new paper
(
), where we propose an extended RouteNet model that is
able to estimate the delay distribution directly, instead of having two separate models
that predict the mean delay and jitter. You can find the code used in this paper at the
following link:
(
). It facilitates extracting the data from the datasets used in the paper (i.e.,
'datasets_v1').
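As a quick illustration of how to get the raw files (this is not the code from the paper,
just a minimal sketch using one of the public datasets_v1 URLs), the tarballs can be
downloaded and unpacked with the Python standard library:

import tarfile
import urllib.request

# Minimal sketch (not the paper's code): download the datasets_v1 tarball for
# the NSFNet topology and unpack it into the current directory.
URL = "http://knowledgedefinednetworking.org/data/datasets_v1/nsfnetbw.tar.gz"
urllib.request.urlretrieve(URL, "nsfnetbw.tar.gz")

with tarfile.open("nsfnetbw.tar.gz", mode="r:gz") as tar:
    tar.extractall(path=".")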
Regards,
José
On 27/10/19 at 18:43, Nathan Sowatskey wrote:
Apologies for replying to my own email. I have seen
that the V1 data sets differ from V0 in that, at least, there is a fifth field, which is the
“Average per-packet neperian logarithm of the delay over the packets transmitted in each
source-destination pair”.
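For my own understanding, I read that field as the mean of the natural (Neperian)
logarithm of the per-packet delays for a given source-destination pair, e.g. (with
made-up delay values):

import math

# Made-up per-packet delays (seconds) for one source-destination pair, purely
# to illustrate my reading of the new fifth field.
packet_delays = [0.012, 0.015, 0.011, 0.020]

avg_log_delay = sum(math.log(d) for d in packet_delays) / len(packet_delays)
mean_delay = sum(packet_delays) / len(packet_delays)

# The average of the logs is not, in general, the log of the average delay.
print(avg_log_delay, math.log(mean_delay))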
I am assuming a version of the sample code has been updated to reflect this change, but I
am unclear about which code.
Are there other differences we should be aware of please?
Many thanks
Nathan
> On 27 Oct 2019, at 14:44, Nathan Sowatskey <nathan(a)nathan.to> wrote:
>
> I have had the opportunity to look at this in greater detail now, so I have some
follow up questions, which are below.
>
> I should explain that I am a bit challenged with this, as the reference in the ACM SOSR
paper (https://dl.acm.org/citation.cfm?id=3314357) for the datasets is:
>
> [3] 2019. Knowledge-Defined Networking. https://github.com/knowledgedefinednetworking.
>
> There are no data sets at that link per se, but you have, I think, explained that the
ACM SOSR data sets are at:
>
>
https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GN…
>
> I have, also, just found this link:
>
>
https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/…
>
> Which was created 25 days ago.
>
> The v1 datasets include these topologies:
>
> - NSFNet topology:
http://knowledgedefinednetworking.org/data/datasets_v1/nsfnetbw.tar.gz
> - GBN topology:
http://knowledgedefinednetworking.org/data/datasets_v1/gbnbw.tar.gz
> - GEANT2 topology:
http://knowledgedefinednetworking.org/data/datasets_v1/geant2bw.tar.gz
> - germany50 topology:
http://knowledgedefinednetworking.org/data/datasets_v1/germany50bw.tar.gz
>
> And that these topologies are in the (later) format, used in the ACM SIGCOMM paper
(
https://dl.acm.org/citation.cfm?id=3342327) that supersedes the ACM SOSR paper
(
https://dl.acm.org/citation.cfm?id=3314357), and with the code here:
>
>
https://github.com/knowledgedefinednetworking/demo-routenet/blob/master/cod…
> So, I *think* that the most up-to-date data sets are the datasets_v1, and that these
datasets_v1 encompass the nsfnetbw and geant2bw topologies from both the ACM SOSR paper
and datasets_v0
(
https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/…).
>
> What I don’t know is if the delay and jitter data is the same in the datasets_v0,
datasets_v1 and the data sets from the ACM SOSR paper.
>
> I can further see that there is a gbnbw topology, which looks like it is the same as
the GBN topology in the ACM SOSR paper.
>
> Then we have, in datasets_v1, a germany50bw topology, whereas in datasets_v0 we have
a synth50bw topology. Looking at the topology diagrams, these appear to be quite different
networks. So you seem to have dropped the synth50bw and included the germany50bw.
>
> Questions:
>
> Is the delay and jitter data in all three versions of the nsfnetbw and geant2bw
topology data sets the same?
>
> Is the gbnbw topology from datasets_v1 the same as the GBN topology from the ACM SOSR
paper, and are the delay and jitter the same?
>
> Why do we have a germany50bw instead of the synth50bw topology?
>
> Can I generally assume that the datasets_v1 supersedes the previous dataset versions
in the sense that the training data is exactly the same, when the topologies are the same?
>
> Can I generally assume that the later format of the data, in tar.gz files, supersedes
any earlier format?
>
> Many thanks
>
> Nathan
>
>
>> On 8 Oct 2019, at 18:07, José Suárez-Varela <jsuarezv(a)ac.upc.edu> wrote:
>>
>> Hi Nathan,
>>
>> In the ACM SOSR paper we only train the model with 260,000 training samples from
the NSF network topology and evaluate it on 100,000 samples simulated in the GBN and
GEANT2 topologies. You can find the datasets used in this paper at the following link:
>>
>>
https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GN…
>>
>> Please do not confuse these datasets with the ones that we used in our ACM
SIGCOMM demo paper ("Challenging the generalization capabilities of Graph Neural
Networks for network modeling"), which are available at the following link:
>>
>>
https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/…
>>
>> We made evaluations (internally) training the jitter model from scratch and it
works perfectly. However, in the ACM SOSR paper we wanted to show the possibility of
applying transfer learning: taking a model trained (in an early stage) to learn the delay
and retraining it to model the jitter. This typically saves training time.
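>> To sketch the general idea (this is NOT the RouteNet code; the model, shapes and data
below are made up purely for illustration), the pattern is to fit a model on delay targets
first and then continue training the same weights on jitter targets:
>>
>> import numpy as np
>> import tensorflow as tf
>>
>> # Toy regressor with fake data, only to illustrate the transfer-learning idea.
>> x = np.random.rand(256, 8).astype("float32")       # made-up path features
>> delay = np.random.rand(256, 1).astype("float32")   # made-up delay targets
>> jitter = np.random.rand(256, 1).astype("float32")  # made-up jitter targets
>>
>> model = tf.keras.Sequential([
>>     tf.keras.layers.Dense(32, activation="relu", input_shape=(8,)),
>>     tf.keras.layers.Dense(1),
>> ])
>> model.compile(optimizer="adam", loss="mse")
>>
>> # Stage 1: train the model to predict the delay.
>> model.fit(x, delay, epochs=5, verbose=0)
>>
>> # Stage 2: keep the learned weights and continue training on the jitter,
>> # which typically converges faster than training from scratch.
>> model.fit(x, jitter, epochs=5, verbose=0)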
>>
>>
>> Regards,
>>
>> José
>>
>> On 6/10/19 at 17:26, Nathan Sowatskey wrote:
>>> Jose, following up now that I have the ACM version of the paper.
>>>
>>> I can see that you are testing with both the GBN and Geant2 networks.
>>>
>>> You also appear to say that you train only with the NSF network, and so you
do not train with the synth50bw network. Is that correct?
>>>
>>> Also, it looks like you have not trained a jitter model from scratch, as you
explained that the jitter model "was trained from a model previously trained for the
delay”. Training a jitter model from scratch is one of the aspects I should like to
explore, so I wanted to understand this aspect better.
>>>
>>> Many thanks
>>>
>>> Nathan
>>>
>>>> On 25 Sep 2019, at 14:17, Nathan Sowatskey <nathan(a)nathan.to>
wrote:
>>>>
>>>> Great, thanks for this. I am trying to get the ACM version of the paper
now.
>>>>
>>>> Regards
>>>>
>>>> Nathan
>>>>
>>>>> On 25 Sep 2019, at 11:55, Jose Suárez-Varela
<jsuarezv(a)ac.upc.edu> wrote:
>>>>>
>>>>> Dear Nathan,
>>>>>
>>>>> You probably read our work-in-progress version uploaded to arXiv.
Please check the latest version published in the proceedings of ACM SOSR
(
https://dl.acm.org/citation.cfm?id=3314357). There, we also make the evaluation on GBN.
>>>>>
>>>>> Sorry for the possible misunderstanding. We uploaded the README page
(
https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GN…)
to provide the link to ACM SOSR.
>>>>>
>>>>> Regarding the GBN topology, unfortunately we did not prepare a
figure. However, you can find an image of this topology in the following paper (Figure 4):
>>>>>
>>>>> J. Pedro, J. Santos, and J. Pires, “Performance evaluation of
integrated OTN/DWDM networks with single-stage multiplexing of optical channel data
units,” in Proceedings of ICTON, 2011, pp. 1–4.
>>>>>
>>>>> I hope it will be useful.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> José
>>>>>
>>>>>
>>>>> On 25/09/2019 at 12:14, Nathan Sowatskey wrote:
>>>>>> Thank you Jose. I have read the paper (many times :-)). I have
seen the details of the evaluation with the Geant2 network, but there is no mention of the
GBN network in the paper.
>>>>>>
>>>>>> I am perfectly comfortable with processing the data (you can see
my code here:
https://github.com/Data-Science-Projects/demo-routenet).
>>>>>>
>>>>>> Specifically for the GBN network, I wanted to see what the
topology looks like. I have the NED file, but I can’t use that NED file with OMNeT++ (for
reasons discussed elsewhere).
>>>>>>
>>>>>> I can, of course, manually reverse engineer the NED file. But I
wanted to ask if there was already a topology diagram just to save me the effort.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Nathan
>>>>>>
>>>>>>> On 25 Sep 2019, at 11:07, Jose Suárez-Varela
<jsuarezv(a)ac.upc.edu> wrote:
>>>>>>>
>>>>>>> Hello Nathan,
>>>>>>>
>>>>>>> All these datasets were used in our paper:
>>>>>>>
>>>>>>> Krzysztof Rusek, José Suárez-Varela, Albert Mestres, Pere
Barlet-Ros, Albert Cabellos-Aparicio; "Unveiling the potential of Graph Neural
Networks for network modeling and optimization in SDN," in Proceedings of ACM
Symposium on SDN Research (SOSR) , pp. 140-151, April 2019.
>>>>>>>
>>>>>>> In particular, we trained RouteNet only with samples of the
NSFNET dataset to predict the delay and jitter. Then, we evaluated the accuracy of the
already-trained models. This evaluation was made separately on the three datasets (NSFNET,
GBN and GEANT2) to test the generalization capability of the model.
>>>>>>>
>>>>>>> Please, find more details in Section 4 (Evaluation of the
accuracy of the GNN model) of the paper.
>>>>>>>
>>>>>>> Also, you can find information on how to process the datasets
at the following link:
>>>>>>>
>>>>>>>
http://knowledgedefinednetworking.org/data/README_gnn.pdf
>>>>>>>
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> José
>>>>>>>
>>>>>>> On 22/09/2019 at 16:55, Nathan Sowatskey wrote:
>>>>>>>> Hi
>>>>>>>>
>>>>>>>> On this page:
>>>>>>>>
>>>>>>>>
https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GN…
>>>>>>>>
>>>>>>>> I have seen that there is this data set:
>>>>>>>>
>>>>>>>>
http://knowledgedefinednetworking.org/data/GBN.zip
>>>>>>>>
>>>>>>>> It is described as having been used for evaluation, but I
can’t find anything else that refers to it.
>>>>>>>>
>>>>>>>> Can anyone tell me more please?
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>>
>>>>>>>> Nathan