Notes

Methodology
The dataset was built as follows:
1. Download the set of videos described in the following article:
Toinon Vigier, Josselin Rousseau, Matthieu Perreira Da Silva, and Patrick Le Callet, "A new HD and UHD video eye tracking dataset," in Proceedings of the 7th International Conference on Multimedia Systems, 2016, pp. 1–6.
2. Run a Python program that processes the luma samples of each CTU of the raw videos to compute the statistics listed below as predictor variables.
3. Run a Python program that processes the viewers' attention annotations for each sequence and maps them to the corresponding CTUs in both the spatial and temporal domains.
4. Match, for each CTU, the predictor variables from step 2 with the attention counts from step 3.
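Steps 2 and 3 can be sketched as follows, a minimal illustration assuming 64x64 CTUs (as in HEVC) and a NumPy array of luma samples per frame; the function name and CTU size are assumptions, not the authors' actual code:

```python
import numpy as np

def ctu_statistics(luma, ctu_size=64):
    """Compute the per-CTU predictor variables for one luma frame.

    `luma` is a 2-D array of luma samples; the frame is assumed to be
    an exact multiple of the CTU size.
    """
    rows = luma.shape[0] // ctu_size
    cols = luma.shape[1] // ctu_size
    records = []
    for r in range(rows):
        for c in range(cols):
            block = luma[r * ctu_size:(r + 1) * ctu_size,
                         c * ctu_size:(c + 1) * ctu_size].astype(float).ravel()
            mu, sigma = block.mean(), block.std()
            centered = block - mu
            # Guard against flat blocks (sigma == 0) when normalizing moments.
            skew = (centered ** 3).mean() / sigma ** 3 if sigma > 0 else 0.0
            kurt = (centered ** 4).mean() / sigma ** 4 if sigma > 0 else 0.0
            # Relative CTU center, used for the distance to the frame
            # center, which is (0.5, 0.5) in relative coordinates.
            cy, cx = (r + 0.5) / rows, (c + 0.5) / cols
            records.append({
                "pos_row": r / max(rows - 1, 1),
                "pos_col": c / max(cols - 1, 1),
                "Center_distance": ((cy - 0.5) ** 2 + (cx - 0.5) ** 2) ** 0.5,
                "Mean": mu,
                "Median": float(np.median(block)),
                "Std": sigma,
                "var": block.var(),
                "Asymmetry": skew,
                "Kurtosis": kurt,
            })
    return records
```

Each record would then be joined with the attention count for the same CTU position and frame to form one row of the CSV.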
FILES
The dataset consists of a single CSV file with a total of 10 variables, 9 of which are input variables for models learnt from the dataset and 1 of which is the variable to be predicted:
1. pos_row: the relative position (between 0 and 1) of the CTU among all the rows of CTUs.
2. pos_col: the relative position (between 0 and 1) of the CTU among all the columns of CTUs.
3. Center_distance: the relative distance between the center of the CTU and the center of the frame (e.g., the center of the frame is represented as $(0.5, 0.5)$ in relative terms).
4. Mean: the mean value of the luma pixels of the CTU.
5. Median: the median value of the luma pixels of the CTU.
6. Std: the standard deviation of the luma pixels of the CTU.
7. var: the variance of the luma pixels of the CTU.
8. Asymmetry: the skewness (or asymmetry) coefficient of the luma pixels of the CTU.
9. Kurtosis: the kurtosis coefficient of the pixels of the CTU.
10. Interest: the number of viewers that paid attention to the CTU (out of 34 viewers).
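The CSV can be split into predictors and target as in the sketch below; this assumes pandas, and the filename in the usage comment is hypothetical:

```python
import pandas as pd

# The 9 predictor columns and the target column, as listed above.
PREDICTORS = ["pos_row", "pos_col", "Center_distance", "Mean", "Median",
              "Std", "var", "Asymmetry", "Kurtosis"]
TARGET = "Interest"

def load_xy(csv_source):
    """Load the dataset and split it into a predictor matrix X and target y."""
    df = pd.read_csv(csv_source)
    return df[PREDICTORS], df[TARGET]

# Usage (hypothetical filename):
# X, y = load_xy("ctu_dataset.csv")
```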