Word frequency queries list the most frequently occurring words in selected sources, nodes, sets and/or annotations. This can be a useful way of identifying themes or concepts. In the query results, you can:
Click the column headings to change the sort order.
Double-click a word to see all the references.
Right-click a word and create it as a node to gather all the references.
To define the criteria for a word frequency query:
In Navigation View, click on Queries.
On the Main toolbar, click the New button.
Click the Word Frequency Query in this folder option.
The Word Frequency Query dialog box is displayed.
From the Search in drop-down list, select whether you want to search in text and/or annotations.
From the Of drop-down list, select the items you want to include in the search:
Option | Description
All Sources | Include all sources in the word frequency query.
Selected Items | Include selected items in the word frequency query. The Select Project Items dialog box is displayed.
Items in Selected Folders | Include all the items in selected folders.
From the Where drop-down list, you can choose to search items created/modified by any or selected users.
Choose how many words you want to display. In the Display Words panel, select All to display a count for all words in the selected items.
OR
Select the <number> most frequent words—for example, you could display the 100 most frequently occurring words.
If you want to exclude short words from the results, enter a number in the With minimum length field. For example, if you set the minimum length to '3', then one and two-letter words are not shown in the results. By default, this field is set to '1', so that all words are displayed.
If you want to save the query, select the Add to Project check box at the top of the dialog. Enter a name and description in the General tab:
Option | Description
Query type | Displays the type of query you are creating. You cannot change the contents of this field.
Name | Enter a name for the query.
Description | If required, enter a description of the query.
Location | Displays the folder that contains the query. You cannot change the contents of this field.
Created | Displays the date and time the query was created. You cannot change the contents of this field.
Modified | Displays the date and time the query was last modified. You cannot change the contents of this field.
To run the query, click the Run button.
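As a rough illustration of how the Display Words and minimum length settings in the procedure above interact, the Python sketch below counts a list of words. It is not NVivo's implementation: the function name `most_frequent` and its parameters are hypothetical, and stop-word handling is deliberately left out.

```python
from collections import Counter

def most_frequent(words, limit=None, min_length=1):
    """Count words, ignoring any that are shorter than min_length.

    limit=None plays the role of the 'All' option; an integer plays the
    role of 'the <number> most frequent words'. (Hypothetical helper.)
    """
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(limit)

words = ["tide", "sea", "ox", "tide", "sea", "sea", "ox", "shore"]

# Minimum length 1 (the default): every word is counted, including "ox".
print(most_frequent(words))

# Minimum length 3, two most frequent words only.
print(most_frequent(words, limit=2, min_length=3))
```

With the minimum length set to 3, the two-letter word "ox" drops out before the most frequent words are selected.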
After running a query, you can:
Preview all the references for a selected word
When you run a word frequency query, a preview node is created for each word—this enables you to see all references to the word.
To open a preview node:
Create and run a word frequency query.
The results are displayed in Detail View.
Double-click the required word.
A preview node for the selected word is opened in Detail View.
In the preview, you see each occurrence of the selected keyword in context (KWIC). By default, the context is 'narrow', meaning five words on either side. You can expand the context; for more information, refer to Viewing the Coding Context. You can also change the definition of narrow to show more or fewer words on each side of the selected word. Refer to Setting Application Options (General Tab) for more information.
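To make the idea of keyword in context (KWIC) concrete, here is a minimal Python sketch that returns each occurrence of a word with a fixed number of words on either side. The function `kwic` and its `width` parameter are hypothetical names for illustration; how NVivo builds its preview internally is not described here.

```python
def kwic(words, keyword, width=5):
    """Return (left context, keyword, right context) for each occurrence.

    width is the number of words kept on either side; 5 corresponds to
    the default 'narrow' context described above. (Hypothetical helper.)
    """
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword.lower():
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            hits.append((" ".join(left), word, " ".join(right)))
    return hits

text = "the tide came in and the tide went out again before the evening".split()
for left, word, right in kwic(text, "tide", width=2):
    print(f"{left} [{word}] {right}")
```

Raising `width` corresponds to expanding the context or to changing the definition of narrow.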
Save the references for a selected word as a node
To save a preview node as a node in your node system:
Create and run a word frequency query.
The results are displayed in Detail View.
Click the required word.
On the Project menu, click Create As.
Click Create As Node.
Define the location and name the node.
If you open the node in your current NVivo session, you see each occurrence of the coded keyword in context (KWIC). By default, the context is 'narrow', meaning five words on either side. If you open the node in a new NVivo session, only the coded keyword in each occurrence is shown. In both cases, you can expand the context; for more information, refer to Viewing the Coding Context. You can change the definition of narrow context so that you see more or fewer words on either side of the coded keyword. Refer to Setting Application Options (General Tab) for more information.
The results are displayed as a summary or tag cloud.
The Summary tab lists the words from the most to the least frequent, returning the following information about each word:
Length—the number of letters or characters in the word.
Count—the number of times that the word has appeared within the project items searched.
Percentage—the frequency of the word relative to the total words counted.
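The three Summary values can be reproduced for a small word list with the sketch below. It assumes that "total words counted" means the sum of the counts of all words that made it into the results; that reading, and the function name `summary_rows`, are assumptions for illustration rather than a statement of how NVivo computes the figures.

```python
from collections import Counter

def summary_rows(words):
    """Return (word, length, count, percentage) rows, most frequent first.

    Percentage is taken relative to the total number of words counted
    (an assumed interpretation). Ties are broken alphabetically.
    """
    counts = Counter(words)
    total = sum(counts.values())
    rows = [(w, len(w), c, 100.0 * c / total) for w, c in counts.items()]
    rows.sort(key=lambda row: (-row[2], row[0]))
    return rows

for word, length, count, pct in summary_rows(["sea", "sea", "tide", "shore"]):
    print(f"{word:<8} length={length} count={count} percentage={pct:.2f}")
```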
The Tag Cloud tab displays up to 100 words alphabetically in varying font sizes, where frequently occurring words are in larger fonts.
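A tag cloud of this kind can be sketched as follows: keep at most 100 of the most frequent words, order them alphabetically, and scale the font size with the count. The linear scaling and the point sizes used here are assumptions chosen for illustration; the source does not state how NVivo maps frequency to font size.

```python
def tag_cloud(counts, max_words=100, min_pt=10, max_pt=40):
    """Return (word, font size) pairs in alphabetical order.

    counts maps each word to its frequency. Font sizes are scaled
    linearly between min_pt and max_pt (an assumed scheme).
    """
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:max_words]
    if not top:
        return []
    lowest = min(c for _, c in top)
    highest = max(c for _, c in top)
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return [
        (word, round(min_pt + (count - lowest) * (max_pt - min_pt) / span))
        for word, count in sorted(top)
    ]

print(tag_cloud({"tide": 2, "sea": 3, "shore": 1}))
```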
When determining the frequency of words, NVivo applies the following rules:
Words containing punctuation (such as hyphens, periods and other symbols) are divided into separate words. For example, part-time will be counted as part and time.
Words containing apostrophes (such as can't and I'd) are treated as one word, but if the apostrophe is followed by an s, the 's is not included (Tom's is counted as Tom).
In audio and video transcripts, only words in the Content field are counted—any words in custom transcript fields are not counted.
Stop words are not counted and are not included in the results. Stop words are language-specific; here are the English stop words:
"a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in", "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "that", "the", "their", "then", "there", "these", "they", "this", "to", "was", "will", "with"
If your sources are in a non-English language, you can change the text search index language, so that NVivo uses a different list of stop words. Refer to the General tab in Setting Project Properties for more information.
If you want to include stop words in the word frequency query, set the text search index language to 'None'. For more information, refer to the General tab in Setting Project Properties.
When searching text in selected nodes, if a word is coded against multiple nodes, it is counted once for each node. Similarly, if a word has been coded by multiple users to the same node, it is counted once for each user.
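The counting rules above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not NVivo's tokenizer: the regular expression, the lower-casing of words, and the helper names `tokenize` and `count_across_nodes` are all illustrative choices. Only the stop word list is taken from the text above.

```python
import re
from collections import Counter

# English stop words as listed above.
STOP_WORDS = {
    "a", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "s", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}

# Runs of letters/digits, optionally joined by apostrophes; any other
# punctuation (hyphens, periods, symbols) acts as a word break.
WORD = re.compile(r"[^\W_]+(?:'[^\W_]+)*")

def tokenize(text):
    """Split text into countable words (hypothetical helper)."""
    words = []
    for token in WORD.findall(text):
        token = token.lower()          # assumption: counting ignores case
        if token.endswith("'s"):
            token = token[:-2]         # Tom's -> tom
        if token and token not in STOP_WORDS:
            words.append(token)
    return words

def count_across_nodes(node_texts):
    """Count words once per node, so text coded at two nodes counts twice."""
    total = Counter()
    for text in node_texts.values():
        total.update(tokenize(text))
    return total

print(tokenize("Tom's part-time job isn't the problem."))
print(count_across_nodes({"Work": "part-time job", "Money": "part-time job"}))
```

In this sketch, leaving `STOP_WORDS` empty would play the same role as setting the text search index language to 'None'.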