Fraud Detection
In 7.0.0, we have integrated Fraud Detection, also known as Speeders and Straightliners detection. It has been implemented at several levels:
- In AskiaDesign
- In Supervisor
- In SurveyData
For any survey, you can decide to:
- Activate the Speeder detection (automatic / manual)
- Activate the Straightliner detection (automatic / manual)
- Define what settings you want to apply
- Deactivate them completely
Speeders
There are two kinds of Speeders detection:
- Based on the Speed Median
This is the median of the interview speeds. The speed takes into account the number of clicks and the number of characters typed in open-ended questions. It is applied to all types of questionnaires (short or long) within the same survey.
The speed calculation is: (number of clicks + number of keystrokes / 3) / Duration
The speed is roughly a number of events per minute.
- Based on the Duration Median
This is the median of the duration of interviews in seconds. We take the durations from a certain number of interviews, compute the median, and check whether the respondent's duration falls below a fraction of it (say, 1/10th of the overall median duration).
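The two medians described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the function name, the sample data and the scaling to events per minute (duration in seconds, multiplied by 60) are assumptions made for the example.

```python
from statistics import median


def interview_speed(clicks, keystrokes, duration_seconds):
    """Speed in events per minute: (clicks + keystrokes / 3) / duration * 60."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    return (clicks + keystrokes / 3) / duration_seconds * 60


# Hypothetical sample: (clicks, keystrokes, duration in seconds) per interview
interviews = [(40, 30, 300), (35, 0, 280), (42, 60, 320)]

speeds = [interview_speed(c, k, d) for c, k, d in interviews]
speed_median = median(speeds)
duration_median = median(d for _, _, d in interviews)

print(speed_median, duration_median)
```

Keystrokes are divided by 3 so that typing a long open-ended answer weighs less than the same number of clicks.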
Straightliners
A Straightliner is someone who answers every row of a grid in the same column. Such a respondent is often a Speeder as well. The detection can therefore be made in two ways:
- Answers on the same column with a duration check
- Answers on the same column with a case of opposing items/logic
Only some loop types are eligible for Straightliner detection: visible table questions made of single-punch questions, with a minimum number of rows and columns.
Speedy Straightliner
Let’s imagine this question:
| | Totally disagree | Disagree | Agree | Totally agree |
|---|---|---|---|---|
| Service was good | x | | | |
| Food was good | x | | | |
| Price was good | x | | | |
| Location was good | x | | | |
If someone loved (or hated) the restaurant, it's possible that they gave all their responses in the same column. So the important criterion here is the duration. If the respondent takes two seconds to fill in this grid with all answers in the same column, we can consider them a speedy straightliner. In this case, we take the median duration for this specific grid and decide how many times faster than that median (X times) a respondent must be for the system to consider them a ‘Speedy Straightliner’.
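As a rough sketch of this check: the grid is flagged only when every row has the same answer and the time spent on it is well below the median for that grid. The function names, and the reading of "X times" as "X times faster than the grid's median duration", are assumptions made for illustration.

```python
def is_straightlined(answers):
    """True when every row of the grid was answered in the same column."""
    return len(answers) > 0 and len(set(answers)) == 1


def is_speedy_straightliner(answers, grid_seconds, grid_median_seconds, factor):
    """Straightlined grid completed more than `factor` times faster than the median."""
    if not is_straightlined(answers):
        return False
    return grid_seconds < grid_median_seconds / factor


# Hypothetical values: 4 rows answered in column 1, in 2 s, median is 20 s, X = 5
print(is_speedy_straightliner([1, 1, 1, 1], 2, 20, 5))
```

A respondent who straightlines but takes a normal amount of time is not flagged by this check, which matches the case of someone who genuinely loved (or hated) the restaurant.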
Opposing Straightliner
Let’s imagine the same question with one extra item:
| | Totally disagree | Disagree | Agree | Totally agree |
|---|---|---|---|---|
| Service was good | X | | | |
| Food was good | X | | | |
| Price was good | X | | | |
| Location was good | X | | | |
| Overall was bad | X | | | |
Clearly, the respondent didn’t pay attention to the answers, and just straightlined through the whole grid. We need to detect and tag these two kinds of Straightliners.
In AskiaDesign
Interface
In AskiaDesign, a new setting, ‘Straightlining’, appears on the sub-question of a loop.
It’s a combo box where you have four choices:
AskiaScript
New keywords have been added in order to tackle Speeders and Straightliners manually:
- Survey.SpeedMedian
Returns the median of the speed of interviews. The speed is (the number of clicks + the number of keystrokes / 3) / Duration
Example: Survey.SpeedMedian > 0.2
- Survey.DurationMedian
Returns the median of the duration of interviews in seconds
Example: Survey.DurationMedian > 25
Final pages
When a Speeder or a Straightliner is detected, the respondent lands on a dedicated final page, Speeders/fraudsters (in Internet options > Final pages), where you can set the actions to take according to the respondent's status.
In Supervisor
A new tab has been added to the survey properties. It holds the settings for Speeders and Straightliners detection, which you can adjust to your requirements.
Note that this setting is hidden by default. To activate it, open the Cca.GeneralSettings table, find the setting with ID 421 and set its Value field to 1 (set it to 0 to deactivate it).
Default Task Properties
All new surveys added on Supervisor will take the default settings you have set in Extra > Defaults > Task > Survey.
The new tab is named ‘Speeders’ and contains all the settings for Speeders and Straightliners detection.
All these settings are written to a specific table in the CCA database: SurveysExtendedProperties.
Here are the default values from this SQL table:
The ValueIDs are described in later chapters.
Main Speeders Tab
The Speeders tab is divided into two parts, one for each type of fraud detection.
Speeders Detection Settings
Some Speeders settings change dynamically in the interface according to the type of detection selected: the ‘Speed > x times’ setting becomes ‘Duration < x%’ when the Duration Median detection is selected.
- Speeders tab settings while using the Speed Median
- Speeders tab settings while using the Duration Median
Speeders Settings explanation
‘Behaviour:’ (SurveyExtendedProperties.ValueID = 1)
Indicates if Speeder detection is activated and how. Values in SurveyExtendedProperties:
- 0: disabled
- 1: manual (you need to use routings)
- 2: automatic
‘Using:’ (SurveyExtendedProperties.ValueID = 2)
Indicates if you are using the duration or the speed. Values in SurveyExtendedProperties:
- 0: duration
- 1: speed = (Num of Clicks + Num of KeyStrokes / 3) / Duration * 60
‘Speed > x Times’ / ‘Duration < x %’ (SurveyExtendedProperties.ValueID = 3)
Indicates what the threshold is to declare that a respondent is a speeder.
- If using speed: this is a factor. Somebody is declared a speeder if their speed is more than X times the speed median
- If using duration: this is a percentage. Somebody is declared a speeder if their duration is less than X% of the duration median
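The threshold logic above can be sketched as a single function. The constants mirror the ‘Using:’ values (0 for duration, 1 for speed); the function and parameter names are hypothetical, and the strict comparisons follow the ‘Speed > x Times’ and ‘Duration < x %’ labels.

```python
DURATION, SPEED = 0, 1  # values of the 'Using:' setting (ValueID = 2)


def is_speeder(using, threshold, value, median_value):
    """Apply the ValueID = 3 threshold to one interview.

    `value` is the interview's speed or duration, depending on `using`.
    """
    if using == SPEED:
        # threshold is a factor: faster than X times the speed median
        return value > threshold * median_value
    if using == DURATION:
        # threshold is a percentage: shorter than X% of the duration median
        return value < median_value * threshold / 100
    raise ValueError("using must be 0 (duration) or 1 (speed)")


print(is_speeder(SPEED, 3, 35.0, 10.0))   # speed 35 vs. median 10, factor 3
print(is_speeder(DURATION, 10, 25, 300))  # 25 s vs. 10% of a 300 s median
```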
‘Detect after:’ (SurveyExtendedProperties.ValueID = 4)
After how many interviews do we recalculate the duration and speed medians? This is a comma-separated list of numbers. E.g. 10,50 means the medians are recalculated after 10 and 50 interviews. The number of interviews includes both the completes and the interviews marked as fraudsters.
‘And then every:’ (SurveyExtendedProperties.ValueID = 5)
After the last specified interval in ‘Detect after’, how frequently do you recalculate the medians?
For example:
- 0: never
- 1: every interview
- 100: every 100 interviews
- X: every X interviews
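To illustrate how ‘Detect after’ and ‘And then every’ could combine, here is a minimal sketch. It assumes the recurring interval is counted from the last value in the ‘Detect after’ list; that reading, and the function name, are assumptions rather than documented behaviour.

```python
def should_recalculate(interview_count, detect_after, then_every):
    """Decide whether the medians are recalculated at this interview count.

    detect_after: comma-separated checkpoints, e.g. "10,50" (ValueID = 4)
    then_every:   recurring interval after the last checkpoint (ValueID = 5),
                  0 meaning never
    """
    checkpoints = [int(n) for n in detect_after.split(",") if n.strip()]
    if interview_count in checkpoints:
        return True
    last = max(checkpoints, default=0)
    if then_every <= 0 or interview_count <= last:
        return False
    return (interview_count - last) % then_every == 0


# With "10,50" and an interval of 100: recalculation at 10, 50, 150, 250, ...
print([n for n in range(1, 300) if should_recalculate(n, "10,50", 100)])
```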
Straightliners Settings explanation
‘Behaviour:’ (SurveyExtendedProperties.ValueID = 7)
Indicates if Straightliner detection is activated and how. Values in SurveyExtendedProperties:
- 0: disabled
- 1: manual (you need to use routings)
- 2: automatic
‘Detect after:’ (SurveyExtendedProperties.ValueID = 15)
After how many interviews do we recalculate the opposability of the statements of a grid? This is a comma-separated list of numbers. E.g. 10,50 means the coefficients are recalculated after 10 and 50 interviews.
‘And then every:’ (SurveyExtendedProperties.ValueID = 16)
After the last specified interval in ‘Detect after’, how frequently do you recalculate the coefficients?
- 0: never
- 1: every interview
- 100: every 100 interviews
- X: every X interviews
‘Grid qualifies for straightlining if’ settings:
‘Rows >=’ (SurveyExtendedProperties.ValueID = 8)
Indicates the minimum number of rows (loop items) a grid needs for automatic straightlining detection to run.
‘Columns >=’ (SurveyExtendedProperties.ValueID = 9)
Indicates the minimum number of answers (modalities) a grid needs for automatic straightlining detection to run.
‘Opposed grid detection:’ (SurveyExtendedProperties.ValueID = 10)
To detect whether a grid has opposed statements, we use a formula which returns a number between 0 and 1.
- 0 means the grid does not have opposed statements
- 1 indicates the grid has opposed statements.
It is calculated as the average of (highest value - lowest value) / range.
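One possible reading of this formula, shown only as a sketch: for each interview, take the spread between the highest and lowest answer in the grid, divide it by the range of the scale, then average over interviews. The averaging over interviews, the numeric coding of answers and the function name are all assumptions.

```python
def opposition_coefficient(interviews, scale_min, scale_max):
    """Average over interviews of (highest answer - lowest answer) / scale range.

    interviews: list of grids, each grid being the list of answer codes per row.
    Returns a number between 0 (no opposed statements) and 1 (opposed statements).
    """
    scale_range = scale_max - scale_min
    if scale_range <= 0 or not interviews:
        raise ValueError("need a non-empty sample and a positive scale range")
    spreads = [(max(grid) - min(grid)) / scale_range for grid in interviews]
    return sum(spreads) / len(spreads)


# Hypothetical 4-point scale coded 1..4, last row being an opposed statement
opposed = [[4, 4, 4, 4, 1], [3, 4, 4, 3, 1], [4, 3, 4, 4, 2]]
# Same scale, no opposed statement
plain = [[3, 3, 4, 3], [4, 4, 4, 4]]

print(opposition_coefficient(opposed, 1, 4), opposition_coefficient(plain, 1, 4))
```

Under this reading, attentive respondents spread their answers across the scale when statements are opposed, so the coefficient of such a grid ends up close to 1.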
‘Straightliner detection (%):’ (SurveyExtendedProperties.ValueID = 11)
This percentage indicates how sensitive the straightlining detection is for a given interview.
‘For Grids WITH Opposed Statements’:
‘Exit After:’ (SurveyExtendedProperties.ValueID = 12)
Indicates after how many grids with opposing statements we should mark an interview as a Straightliner.
‘For Grids WITHOUT Opposed Statements’:
‘Exit After:’ (SurveyExtendedProperties.ValueID = 13)
Indicates after how many grids without opposing statements we should mark an interview as a Straightliner. This is used in combination with the ‘Speed > X times’ property below.
‘Speed > X times’ (SurveyExtendedProperties.ValueID = 14)
If a grid without opposed statements is straightlined, we mark the interview as fraud if its speed is more than X times the speed median.
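The way the two ‘Exit After’ settings and the speed factor might combine can be sketched like this. The function name, the parameter names and the use of "at least N straightlined grids" as the trigger are assumptions made for illustration.

```python
def is_straightliner(opposed_hits, plain_hits, speed, speed_median,
                     exit_after_opposed, exit_after_plain, speed_factor):
    """Decide whether an interview is marked as a Straightliner.

    opposed_hits: straightlined grids WITH opposed statements so far
    plain_hits:   straightlined grids WITHOUT opposed statements so far
    """
    # Grids with opposed statements: straightlining alone is enough (ValueID = 12)
    if opposed_hits >= exit_after_opposed:
        return True
    # Grids without opposed statements: also require a high speed (ValueIDs 13, 14)
    return plain_hits >= exit_after_plain and speed > speed_factor * speed_median


# Hypothetical settings: exit after 1 opposed grid, or 2 plain grids at 3x the median
print(is_straightliner(0, 2, 35.0, 10.0, 1, 2, 3))
```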
In Survey Data
New fields have been added to the AskiaXXXXInterview table in order to store all of this new metadata.
- ClickCount (integer): Number of clicks
- Duration (float): Interview duration in seconds.
- Speed (float): Interview speed - number of clicks per minute (roughly).
- FraudInformation (ntext): JSON data about speed and duration medians as well as the list of questions impacted by Straightlining.
Two new fraud-detection codes can appear in the “LastResult” field:
- Speeders: 1006
- Straightliners: 1007
For Straightliners, we have also modified the LastSubResultCode field (its type changes from short to long). It now holds the ID of the question which triggered the Straightliner detection.
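When reading exported interview rows, the two codes can be interpreted with a small helper such as the one below. The function name and the output wording are hypothetical; only the codes 1006 and 1007 and the meaning of LastSubResultCode come from the description above.

```python
FRAUD_CODES = {1006: "Speeder", 1007: "Straightliner"}


def describe_fraud(last_result, last_sub_result_code=None):
    """Return a readable label for a LastResult value, or None if not fraud."""
    label = FRAUD_CODES.get(last_result)
    if label is None:
        return None
    if last_result == 1007 and last_sub_result_code is not None:
        # For Straightliners, LastSubResultCode holds the triggering question ID
        return f"{label} (triggered by question ID {last_sub_result_code})"
    return label


print(describe_fraud(1006))
print(describe_fraud(1007, 42))
```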