
AskiaVista data management and best practices

Summary: This document provides guidance and recommended settings for AskiaVista
Applies to: AskiaVista
Written for: AskiaVista administrators
Keywords: Vista; askiavista; scheduled tasks; settings; inverted; .dat; guidelines; best practise

AskiaVista is a multi-user, multi-survey platform that allows simultaneous access by designers/administrators and reporters/users/analysts.
Because several people can use different features of AskiaVista at the same time, certain considerations have to be taken into account when managing surveys.
This best practice guide will help you use the software to its fullest potential. It is primarily aimed at AskiaVista administrators.

Let us first detail a few parameters that you will see in the AskiaVista administration interface:

Inverted data:

Inverted data can be read faster than "regular" data. The difference is negligible for small surveys, but in large surveys inverted data significantly reduces the time needed to load a data table for the end user.
This comes at the cost of actually inverting the data. The operation takes time, depending on the size of the data to invert and on the speed of your CPU.


To speed up the inversion, AskiaVista provides an "Inversion type" option. If the interviews already collected are not being edited or deleted, the "Verify inversion" option skips all existing interviews and inverts only new data, which makes the inversion faster. Otherwise, use the "Full inversion" option.
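
The difference between the two inversion types can be pictured with a short Python sketch. This is only a conceptual model, not AskiaVista code: the function name, the "verify" and "full" strings, and the use of plain sets of interview IDs are all assumptions made for illustration.

```python
def interviews_to_invert(collected_ids, already_inverted_ids, inversion_type):
    """Return the interview IDs that an inversion run would process.

    Hypothetical model: collected_ids stands for the interviews stored
    for the survey, already_inverted_ids for those that already have
    inverted data.
    """
    if inversion_type == "full":
        # Full inversion: every interview is inverted again.
        return sorted(collected_ids)
    if inversion_type == "verify":
        # Verify inversion: existing interviews are skipped,
        # only interviews without inverted data are processed.
        return sorted(set(collected_ids) - set(already_inverted_ids))
    raise ValueError(f"unknown inversion type: {inversion_type!r}")


collected = {101, 102, 103, 104}
inverted = {101, 102}
print(interviews_to_invert(collected, inverted, "verify"))  # [103, 104]
print(interviews_to_invert(collected, inverted, "full"))    # [101, 102, 103, 104]
```

In this model, an interview that was edited after being inverted (101 or 102 here) is never picked up by the "verify" run, which is why "Full inversion" is needed when existing interviews can change.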

More about inverted data.

SQL server format:

There are two available SQL formats in AskiaVista: "new SQL format" and "SQL (Legacy)". The new format gives faster data access and should be preferred in all cases, except if you are running CCA 5.3.1.

 

Scheduled tasks:

To keep tables and counts up to date without any manual intervention, AskiaVista lets you schedule tasks for each survey.
While some tasks complete almost instantly, others (such as data inversion) are CPU-intensive and should be scheduled with extra care.
This is especially true as you scale up and more and more surveys become accessible on the same AskiaVista server.

Here is a summary of the available tasks for scheduling.

  • Invert and reload survey: Inverts the data first, then updates the survey information and updates the data with the latest respondent data. If "Use inverted data when available" is not checked, this task is useless and only wastes system resources. Load: very intensive.
  • Reload survey data: Updates the data only, with the latest respondent data. Use this when more data has been collected but the survey itself has not changed. Load: light.
  • Reload survey: Updates the survey information and updates the data with the latest respondent data. Use this when the survey has changed and additional respondent data has been collected. Load: light.
  • Generate survey structure: Only used for AskiaVista 5.x.
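
The task descriptions above can be read as a simple selection rule, sketched here in Python. This is an illustration rather than an official recommendation: the function and its two parameters are hypothetical, and it assumes that a survey using inverted data always needs the inversion task in order to pick up new interviews. It does not reproduce the thresholds of the decision flowchart.

```python
def pick_scheduled_task(use_inverted_data, survey_changed):
    """Suggest a scheduled task name from the summary of available tasks.

    Hypothetical helper: use_inverted_data mirrors the
    "Use inverted data when available" checkbox, survey_changed says
    whether the survey itself was modified since the last reload.
    """
    if use_inverted_data:
        # Assumption: new data must be inverted before it is reloaded.
        return "Invert and reload survey"
    if survey_changed:
        # Survey information and respondent data both need refreshing.
        return "Reload survey"
    # Only new respondent data was collected.
    return "Reload survey data"


print(pick_scheduled_task(use_inverted_data=False, survey_changed=False))
# Reload survey data
```

The rule never returns "Invert and reload survey" when inverted data is not in use, since that task would only waste system resources in that case.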

 

Below is a decision flowchart that contains recommended task setup settings for different types of surveys: 

 

FAQ

Even after applying the recommended settings, my survey is slow or its structure fails to generate on the AskiaVista application server. What can I do?

The survey probably has a very large data structure. The following checks can help reduce the size of the data files.

  • In AskiaDesign, identify the variables and loops that do not need to be displayed in AskiaVista. Uncheck the "Visible in Analyse" checkbox on those variables, and uncheck "Develop level in Analyse" on those loops.
  • In AskiaVista's administration interface, deactivate the languages that are not used for reporting. Many multilingual surveys are only ever displayed in single-language dashboards.


Comments

  • Nicolas MARTY

    Very good page! Thanks a lot, Stephen!

    1) After a few tests, if I understand correctly, "Verify inversion" deletes inverted data. So I don't understand the purpose of this inversion type.

    2) Also, the numbers of interviews and questions are given as a guide for good data management, but I think they don't take portfolio generation time into account. For example, I have a study with 1000 interviews and 300 questions. The portfolio takes more than 1 minute to generate without inverted data. It takes 3 seconds with inverted data.

    Do you have any figures for portfolio generation times according to the number of interviews and the number of questions?

  • Stephen Bronnec

    Hello Nicolas,
    Thanks for the comments and questions!

    Regarding point 1),
    "Verify inversion" will scan the SQL server looking for new interview IDs.
    If a new interview ID is found in the SQL tables and no inverted data is found for that ID in the inverted folder, AskiaVista will perform the actual data inversion for that interview.
    On the other hand, if the inverted folder already contains data for an interview (because it has already been inverted), the data will not be inverted again. Neither will it be deleted from the inverted folder. Nothing happens.
    This option can save your server a lot of computing time by not inverting the same interview during each inversion.
    But this is only useful if you know that existing interviews will not be edited by another user. If they are edited, your AskiaVista platform can only display the updated interviews once the server has re-inverted the existing interviews, which requires the "Full inversion" option.

    Regarding point 2),
    Apart from the size of your questionnaire (number of questions and interviews) and the data type (inverted or not), the rendering time of portfolios varies greatly depending on these parameters:
    - the computing power of your server. The decision chart above is based on a low/mid-range server as of 2014 (~16 GB RAM, 4 cores, 8 threads)
    - the number of questions that you actually display in your tables. Loops take longer to display depending on their number of iterations
    - the number of tables that are included in your portfolio
    - the type of data storage. "SQL server (Legacy)" is less optimized than "SQL server (new)"

    Unfortunately, we don't have any rendering time benchmarks on that topic.
    Inverted data is without question the easiest way to improve your portfolio rendering times, and it really seems worth using in the example you describe above!