On 28 Oct 2019, at 16:11, José Suárez-Varela
<jsuarezv(a)ac.upc.edu> wrote:
Hi Nathan,
I answer your questions below.
- Is the delay and jitter data in all three versions of the nsfnetbw and geant2bw
topology data sets the same?
Every dataset has some particularities. The dataset used in the ACM SOSR paper
(https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN/tree/master/datasets)
assumes that all the links have the same capacity. Note that the RouteNet model
presented in this paper did not support encoding the link capacity. However, in the
two other datasets (datasets_v0 and datasets_v1) the topologies have variable link
capacities.
As you noted, the topologies also vary between 'datasets_v0' and 'datasets_v1'.
In addition, for those topologies that are present in both datasets (nsfnet, geant)
the simulation samples change: they have different combinations of traffic matrices
and routings. Finally, in 'datasets_v0' we consider discrete values of traffic
intensity (<lambda>) for each simulation file (e.g.,
results_<topology_name>_<lambda>_<routing_scheme>.tar.gz), while in 'datasets_v1'
each simulation file contains samples simulated within a continuous range of traffic
intensities ([lower lambda, upper lambda]) that is described in the file name
(results_<topology_name>_<lower lambda max>-<upper lambda max>_<routing_scheme>_0_124.tar.gz).
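As an illustration only, here is a minimal sketch of how the two file-naming schemes described above could be told apart. The regular expressions are inferred from the patterns quoted in this email and are not part of any project code. They assume that the topology name contains no underscores and that the lambda values are plain numbers. The function name parse_results_name and the example file names (including the "SP" routing label) are hypothetical.

```python
import re
from typing import Optional

# Assumed pattern for datasets_v0: one discrete traffic intensity per file.
V0_NAME = re.compile(
    r"^results_(?P<topology>[^_]+)_(?P<lam>\d+(?:\.\d+)?)_(?P<routing>.+)\.tar\.gz$"
)

# Assumed pattern for datasets_v1: a lower-upper intensity range per file,
# followed by the literal "_0_124" suffix quoted in the email.
V1_NAME = re.compile(
    r"^results_(?P<topology>[^_]+)"
    r"_(?P<lower>\d+(?:\.\d+)?)-(?P<upper>\d+(?:\.\d+)?)"
    r"_(?P<routing>.+)_0_124\.tar\.gz$"
)


def parse_results_name(name: str) -> Optional[dict]:
    """Split a results file name into its parts, or return None if it fits neither scheme."""
    match = V1_NAME.match(name)
    if match:
        return {
            "version": "datasets_v1",
            "topology": match["topology"],
            "lambda_range": (float(match["lower"]), float(match["upper"])),
            "routing": match["routing"],
        }
    match = V0_NAME.match(name)
    if match:
        # A discrete intensity is reported as a zero-width range.
        intensity = float(match["lam"])
        return {
            "version": "datasets_v0",
            "topology": match["topology"],
            "lambda_range": (intensity, intensity),
            "routing": match["routing"],
        }
    return None


if __name__ == "__main__":
    # Hypothetical file names, written to follow the two schemes.
    print(parse_results_name("results_nsfnetbw_9_SP.tar.gz"))
    print(parse_results_name("results_geant2bw_1000-1199_SP_0_124.tar.gz"))
```

The v1 pattern is tried first because its hyphenated range is the more specific of the two. If the real file names deviate from the schemes quoted above, the function returns None rather than guessing.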
- Is the gbnbw topology from datasets_v1 the same as the GBN topology from the ACM SOSR
paper, and are the delay and jitter the same?
It is the same topology, but in 'datasets_v1' the link capacity is variable.
Also, the simulation samples (i.e., traffic and routing) are different. Consequently, the
distribution of delay and jitter is different.
- Why do we have a germany50bw instead of the synth50bw topology?
Basically, we decided to use a real-world 50-node network topology (extracted from
http://sndlib.zib.de/home.action?show=/germany50.overview.action%3Fframeset)
instead of the synthetically generated 'synth50bw' topology.
- Can I generally assume that the datasets_v1 supersedes the previous dataset versions in
the sense that the training data is exactly the same, when the topologies are the same?
No, for the reasons explained above.
- Can I generally assume that the later format of the data, in tar.gz files, supersedes
any earlier format?
The data format will probably evolve as we continue our research on RouteNet. For
instance, 'datasets_v1' was used in our new paper
(https://arxiv.org/abs/1910.01508), where we
propose an extended RouteNet model that is able to estimate the delay distribution
directly, instead of having two separate models that predict the mean delay and jitter.
You can find the code used in this paper at the following link:
https://github.com/knowledgedefinednetworking/Papers/wiki/RouteNet:-Leveraging-GNN-for-network-modeling-and-optimization-in-SDN
In this case, we provided a parser
(https://github.com/knowledgedefinednetworking/net2vec/blob/RouteNet-JSAC19/routenet/upcdataset.py)
that makes it easier to extract the data from the datasets used in the paper (i.e.,
'datasets_v1').
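Before using the parser linked above, it can help to check what a downloaded archive actually contains. The following is a minimal sketch that uses only Python's standard tarfile module. It does not use or reproduce the upcdataset.py parser, and the function name list_archive_members and the example archive path are hypothetical.

```python
import tarfile
from typing import List


def list_archive_members(archive_path: str) -> List[str]:
    """Return the names of the regular files inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Skip directories and links; keep only regular files.
        return [member.name for member in archive.getmembers() if member.isfile()]


if __name__ == "__main__":
    # Hypothetical local path to one of the downloaded dataset archives.
    for name in list_archive_members("nsfnetbw.tar.gz"):
        print(name)
```

Listing the members first makes it easy to confirm which file-naming scheme an archive follows before any parsing is attempted.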
Regards,
José
On 27/10/19 at 18:43, Nathan Sowatskey wrote:
> Apologies for replying to my own email. I have seen that the V1 data sets differ from
> V0 in that, at least, there is a fifth field: the “Average per-packet neperian logarithm
> of the delay over the packets transmitted in each source-destination pair”.
>
> I am assuming a version of the sample code has been updated to reflect this change,
> but I am unclear about which code.
>
> Are there other differences we should be aware of please?
>
> Many thanks
>
> Nathan
>
>> On 27 Oct 2019, at 14:44, Nathan Sowatskey <nathan(a)nathan.to> wrote:
>>
>> I have had the opportunity to look at this in greater detail now, so I have some
>> follow-up questions, which are below.
>>
>> I should explain that I am a bit challenged with this, as the reference in the ACM
>> SOSR paper (https://dl.acm.org/citation.cfm?id=3314357) for the datasets is:
>>
>> [3] 2019. Knowledge-Defined Networking. https://github.com/knowledgedefinednetworking.
>>
>> There are no data sets at that link per se, but you have, I think, explained that
>> the ACM SOSR data sets are at:
>>
>> https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN/tree/master/datasets
>>
>> I have, also, just found this link:
>>
>> https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/master/datasets_v1
>>
>> Which was created 25 days ago.
>>
>> The v1 datasets include these topologies:
>>
>> - NSFNet topology: http://knowledgedefinednetworking.org/data/datasets_v1/nsfnetbw.tar.gz
>> - GBN topology: http://knowledgedefinednetworking.org/data/datasets_v1/gbnbw.tar.gz
>> - GEANT2 topology: http://knowledgedefinednetworking.org/data/datasets_v1/geant2bw.tar.gz
>> - germany50 topology: http://knowledgedefinednetworking.org/data/datasets_v1/germany50bw.tar.gz
>>
>> And that these topologies are in the (later) format, used in the ACM SIGCOMM
>> paper (https://dl.acm.org/citation.cfm?id=3342327) that supersedes the ACM SOSR paper
>> (https://dl.acm.org/citation.cfm?id=3314357), and with the code here:
>>
>> https://github.com/knowledgedefinednetworking/demo-routenet/blob/master/code/routenet_with_link_cap.py
>>
>> So, I *think* that the most up-to-date data sets are the datasets_v1, and that
>> these datasets_v1 encompass the nsfnetbw and geant2bw topologies from both the ACM SOSR
>> paper and datasets_v0
>> (https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/master/datasets_v0).
>>
>> What I don’t know is if the delay and jitter data is the same in the datasets_v0,
>> datasets_v1 and the data sets from the ACM SOSR paper.
>>
>> I can further see that there is a gbnbw topology, which looks like it is the same
>> as the GBN topology in the ACM SOSR paper.
>>
>> Then we have, in datasets_v1, a germany50bw topology, whereas in datasets_v0 we
>> have a synth50bw topology. Looking at the topology diagrams, these appear to be quite
>> different networks. So you seem to have dropped the synth50bw and included the
>> germany50bw.
>>
>> Questions:
>>
>> Is the delay and jitter data in all three versions of the nsfnetbw and geant2bw
>> topology data sets the same?
>>
>> Is the gbnbw topology from datasets_v1 the same as the GBN topology from the ACM
>> SOSR paper, and are the delay and jitter the same?
>>
>> Why do we have a germany50bw instead of the synth50bw topology?
>>
>> Can I generally assume that the datasets_v1 supersedes the previous dataset
>> versions in the sense that the training data is exactly the same, when the topologies are
>> the same?
>>
>> Can I generally assume that the later format of the data, in tar.gz files,
>> supersedes any earlier format?
>>
>> Many thanks
>>
>> Nathan
>>
>>
>>> On 8 Oct 2019, at 18:07, José Suárez-Varela <jsuarezv(a)ac.upc.edu> wrote:
>>>
>>> Hi Nathan,
>>>
>>> In the ACM SOSR paper we only train the model with 260,000 training samples
>>> from the NSF network topology and evaluate it on 100,000 samples simulated in the GBN and
>>> GEANT2 topologies. You can find the datasets used in this paper at the following link:
>>>
>>> https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN/tree/master/datasets
>>>
>>> Please do not confuse these datasets with the ones that we used in our ACM
>>> SIGCOMM demo paper ("Challenging the generalization capabilities of Graph Neural
>>> Networks for network modeling"), which are at this link:
>>>
>>> https://github.com/knowledgedefinednetworking/NetworkModelingDatasets/tree/master/datasets_v0
>>>
>>> We ran evaluations (internally) in which we trained the jitter model from scratch, and
>>> it works perfectly. However, in the ACM SOSR paper we wanted to show that transfer
>>> learning is possible: we take a model trained (at an early stage) to learn the delay and
>>> retrain it to model the jitter. This typically saves training time.
>>>
>>>
>>> Regards,
>>>
>>> José
>>>
>>> On 6/10/19 at 17:26, Nathan Sowatskey wrote:
>>>> Jose, following up now that I have the ACM version of the paper.
>>>>
>>>> I can see that you are testing with both the GBN and Geant2 networks.
>>>>
>>>> You also appear to say that you train only with the NSF network, and so
>>>> you do not train with the synth50bw network. Is that correct?
>>>>
>>>> Also, it looks like you have not trained a jitter model from scratch, as
>>>> you explained that the jitter model “was trained from a model previously trained for
>>>> the delay”. Training a jitter model from scratch is one of the aspects I should like to
>>>> explore, so I wanted to understand this aspect better.
>>>>
>>>> Many thanks
>>>>
>>>> Nathan
>>>>
>>>>> On 25 Sep 2019, at 14:17, Nathan Sowatskey <nathan(a)nathan.to> wrote:
>>>>>
>>>>> Great, thanks for this. I am trying to get the ACM version of the
>>>>> paper now.
>>>>>
>>>>> Regards
>>>>>
>>>>> Nathan
>>>>>
>>>>>> On 25 Sep 2019, at 11:55, Jose Suárez-Varela <jsuarezv(a)ac.upc.edu> wrote:
>>>>>>
>>>>>> Dear Nathan,
>>>>>>
>>>>>> You probably read our work-in-progress version uploaded to arXiv.
>>>>>> Please check the latest version published in the proceedings of ACM SOSR
>>>>>> (https://dl.acm.org/citation.cfm?id=3314357). There, we also make the evaluation on
>>>>>> GBN.
>>>>>>
>>>>>> Sorry for the possible misunderstanding. We uploaded the README page
>>>>>> (https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN)
>>>>>> to provide the link to ACM SOSR.
>>>>>>
>>>>>> Regarding the GBN topology, unfortunately we didn't prepare a
>>>>>> figure. However, you can find an image of this topology in the following paper (Figure 4):
>>>>>>
>>>>>> J. Pedro, J. Santos, and J. Pires, “Performance evaluation of
>>>>>> integrated OTN/DWDM networks with single-stage multiplexing of optical channel data
>>>>>> units,” in Proceedings of ICTON, 2011, pp. 1–4.
>>>>>>
>>>>>> I hope it will be useful.
>>>>>>
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> José
>>>>>>
>>>>>>
>>>>>> On 25/09/2019 at 12:14, Nathan Sowatskey wrote:
>>>>>>> Thank you Jose. I have read the paper (many times :-)). I
>>>>>>> have seen the details of the evaluation with the Geant2 network, but there is no mention
>>>>>>> of the GBN network in the paper.
>>>>>>>
>>>>>>> I am perfectly comfortable with processing the data (you can
>>>>>>> see my code here: https://github.com/Data-Science-Projects/demo-routenet).
>>>>>>>
>>>>>>> Specifically for the GBN network, I wanted to see what the
>>>>>>> topology looks like. I have the NED file, but I can’t use that NED file with OMNet (for
>>>>>>> reasons discussed elsewhere).
>>>>>>>
>>>>>>> I can, of course, manually reverse engineer the NED file. But
>>>>>>> I wanted to ask if there was already a topology diagram just to save me the effort.
>>>>>>>
>>>>>>> Regards
>>>>>>>
>>>>>>> Nathan
>>>>>>>
>>>>>>>> On 25 Sep 2019, at 11:07, Jose Suárez-Varela <jsuarezv(a)ac.upc.edu> wrote:
>>>>>>>>
>>>>>>>> Hello Nathan,
>>>>>>>>
>>>>>>>> All these datasets were used in our paper:
>>>>>>>>
>>>>>>>> Krzysztof Rusek, José Suárez-Varela, Albert Mestres, Pere
>>>>>>>> Barlet-Ros, Albert Cabellos-Aparicio; "Unveiling the potential of Graph Neural
>>>>>>>> Networks for network modeling and optimization in SDN," in Proceedings of ACM
>>>>>>>> Symposium on SDN Research (SOSR), pp. 140-151, April 2019.
>>>>>>>>
>>>>>>>> Particularly, we trained RouteNet only with samples of
>>>>>>>> the NSFNET dataset to predict the delay and jitter. Then, we evaluated the accuracy of the
>>>>>>>> trained models. This evaluation was made separately on the three datasets (NSFNET,
>>>>>>>> GBN and GEANT2) to test the generalization capability of the model.
>>>>>>>>
>>>>>>>> Please find more details in Section 4 (Evaluation of the
>>>>>>>> accuracy of the GNN model) of the paper.
>>>>>>>>
>>>>>>>> Also, you can find information on how to process the
>>>>>>>> datasets at the following link:
>>>>>>>>
>>>>>>>> http://knowledgedefinednetworking.org/data/README_gnn.pdf
>>>>>>>>
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> José
>>>>>>>>
>>>>>>>> On 22/09/2019 at 16:55, Nathan Sowatskey wrote:
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> On this page:
>>>>>>>>>
>>>>>>>>> https://github.com/knowledgedefinednetworking/Unveiling-the-potential-of-GNN-for-network-modeling-and-optimization-in-SDN/tree/master/datasets
>>>>>>>>>
>>>>>>>>> I have seen that there is this data set:
>>>>>>>>>
>>>>>>>>> http://knowledgedefinednetworking.org/data/GBN.zip
>>>>>>>>>
>>>>>>>>> It is described as having been used for evaluation,
>>>>>>>>> but I can’t find anything else that refers to it.
>>>>>>>>>
>>>>>>>>> Can anyone tell me more please?
>>>>>>>>>
>>>>>>>>> Many thanks
>>>>>>>>>
>>>>>>>>> Nathan
>>>>>>>>> _______________________________________________
>>>>>>>>> Kdn-users mailing list
>>>>>>>>> Kdn-users(a)knowledgedefinednetworking.org
>>>>>>>>> https://mail.n3cat.upc.edu/cgi-bin/mailman/listinfo/kdn-users