One way to limit the sheer volume of information and focus attention on the words that best characterize James's style is to concentrate on words whose frequencies differ sharply among the early, intermediate, and late novels, using the Distinctiveness Ratio (see Appendix 5 and Hoover et al., chapter four, for details).
After calculating the average frequencies of the words in the seven early and eight late novels, I calculated the Distinctiveness Ratio between the two and sorted the word list into two groups: words more frequent in the early novels and words more frequent in the late novels.
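The early/late comparison can be illustrated with a minimal Python sketch. It rests on assumptions not stated in this passage: that the Distinctiveness Ratio is a simple quotient of the two average frequencies (the full definition is in Appendix 5), that words absent from one group receive an infinite ratio, and that words with equal averages belong to neither group. The function name and the input layout are hypothetical.

```python
from statistics import mean


def split_by_distinctiveness(early, late):
    """Split words into 'more frequent early' and 'more frequent late'.

    early, late: dicts mapping word -> list of per-novel frequencies
    (hypothetical layout; e.g. occurrences per 10,000 words).
    Returns two dicts mapping word -> ratio of larger to smaller average.
    """
    more_early, more_late = {}, {}
    for word in early.keys() | late.keys():
        # A word missing from one group is treated as having frequency 0.
        e = mean(early.get(word, [0.0]))
        l = mean(late.get(word, [0.0]))
        if e > l:
            # Assumption: an absent word yields an infinite ratio.
            more_early[word] = e / l if l > 0 else float("inf")
        elif l > e:
            more_late[word] = l / e if e > 0 else float("inf")
        # Words with equal averages are not distinctive of either group.
    return more_early, more_late
```

Sorting either dictionary by its ratio, largest first, would then put the most distinctive words of that group at the top.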
From the remaining words that occur at least twenty times, I selected the 500 with the most variable frequencies, as measured by the total Distinctiveness Ratio (the ratio of the maximum to the minimum frequency plus the ratio of the maximum to the median frequency).
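As a rough sketch of this selection step, the total Distinctiveness Ratio can be computed directly from the definition in parentheses. The sketch assumes that each word has one average frequency per period (early, intermediate, late), that the twenty-occurrence threshold applies to a word's total raw count, and that a zero minimum or median yields an infinite score. The function names and data layout are hypothetical.

```python
from statistics import median


def total_distinctiveness(freqs):
    """Return max/min + max/median for one word's frequencies."""
    hi, lo, mid = max(freqs), min(freqs), median(freqs)
    if lo == 0 or mid == 0:
        # Assumption: a zero denominator counts as maximally variable.
        return float("inf")
    return hi / lo + hi / mid


def most_variable_words(period_freqs, total_counts, min_count=20, n=500):
    """Select the n words with the highest total Distinctiveness Ratio.

    period_freqs: word -> list of average frequencies, one per period.
    total_counts: word -> total raw occurrences across all novels.
    """
    candidates = [
        word for word in period_freqs
        if total_counts.get(word, 0) >= min_count
    ]
    candidates.sort(
        key=lambda word: total_distinctiveness(period_freqs[word]),
        reverse=True,
    )
    return candidates[:n]
```

For a word with period frequencies of 2, 4, and 8, the score is 8/2 + 8/4 = 6; a word with identical frequencies in all three periods scores the minimum possible value of 2.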