You need an idea.
You need a realistic idea: Can it be described by a number and understood by words?
You need to get these words and numbers: That's your job. (If you do have a realistic idea and would like to know where to start, or what we can do for you, please contact us.)
Your data should be in the correct form. It needs to be a .csv file with only two columns: your numbers and the texts describing your activity.
First column: Numbers. The first element of the first column is the unit (for instance M$, votes, likes, shares, millions of viewers). Units are important in science, and you need one to perform NLQ. Finding your unit might be the first step of the process. Then come your numbers: digits only, no other characters such as commas; dots for decimals are OK.
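If your raw numbers come from a spreadsheet or a report, they may contain thousands separators or stray spaces. The following is a minimal sketch of how you could clean them before building the first column. The helper `to_plain_number` is hypothetical (it is not part of NLQ), and it assumes that any comma in your raw values is a thousands separator, not a decimal comma. It also rejects signs and other symbols, following the "digits only" rule above.

```python
import re

# Hypothetical helper, not part of NLQ.
# Accepts digits with an optional decimal part, e.g. "300" or "1250.5".
_PLAIN_NUMBER = re.compile(r"\d+(\.\d+)?")


def to_plain_number(raw: str) -> str:
    """Return `raw` as digits with an optional decimal dot, or raise ValueError."""
    # Assumption: commas are thousands separators and can simply be dropped.
    cleaned = raw.strip().replace(",", "").replace(" ", "")
    if not _PLAIN_NUMBER.fullmatch(cleaned):
        raise ValueError(f"not a plain number: {raw!r}")
    return cleaned


unit = "M$"  # the first element of the first column is the unit
values = [to_plain_number(v) for v in ["1,250.5", "300", " 42.0 "]]
first_column = [unit] + values
print(first_column)
```

If your source uses a decimal comma (as in "3,5"), this helper would silently turn it into "35", so convert those values to a decimal dot yourself first.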
Second column: Texts. The first element of the second column is the description of your texts (no commas in the title, please). Then the texts come as they are (all characters are welcome here). We do the cleaning for you: we remove the HTML tags, perform the lightest stemming (by removing the "s" at the end of words), create a term-document matrix, estimate the posterior distribution using collapsed Gibbs sampling, draw the per-word topic assignments and per-document topic proportions, quantify how much is generated by each topic, and write up a summary report for you. That's our job.
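To make the two-column layout concrete, here is a small sketch that writes such a file with Python's standard `csv` module. The unit, the title, the example rows, and the file name `nlq_input.csv` are all made up for illustration. The `csv` module puts quotes around any text that contains a comma or a quotation mark so that it stays in a single column; that the NLQ uploader reads this standard quoting is an assumption on our side of the sketch, not something stated above.

```python
import csv
import io

unit = "likes"       # header of the first column: your unit
title = "Post text"  # header of the second column: no comma allowed
rows = [
    # (number, text) pairs; the texts are left as they are, HTML tags included.
    ("120", "Launch day, at last! <b>Thanks</b> to everyone."),
    ("45.5", 'A quiet "behind the scenes" update'),
]


def build_csv(unit, title, rows):
    """Return the two-column CSV content as a string."""
    if "," in title:
        raise ValueError("the text-column title must not contain a comma")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([unit, title])  # first row: unit and text description
    writer.writerows(rows)          # then one row per number/text pair
    return buffer.getvalue()


content = build_csv(unit, title, rows)

# newline="" keeps the line endings exactly as built above.
with open("nlq_input.csv", "w", newline="", encoding="utf-8") as handle:
    handle.write(content)
```

Opening the resulting file in a text editor is a quick way to confirm that every row has exactly one number and one text.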
If you made it here, once again: Congratulations! Go to the ds4all.io/nlq main page, upload your .csv, run NLQ, and get your results!