I’d like to see them take variance (i.e., the spread of sentence lengths) into account. My impression is that more modern writing has a more dynamic mix of long and short sentences, which could skew the averages in an unintuitive way.
Consider two paragraphs that both have a total of 300 words:
Paragraph A has three 100-word sentences.
Paragraph B has two 140-word sentences and four 5-word sentences.
Paragraph B has half the average sentence length of paragraph A (50 words versus 100), but over 90% of B's text (280 of its 300 words) consists of sentences that are significantly longer than any sentence in A.
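
To make the arithmetic concrete, here is a rough Python sketch of the comparison. The function names and the choice of a "word-weighted" mean (the average length of the sentence that a randomly picked word sits in) are my own illustration, not anything the original analysis uses.

```python
from statistics import mean, pstdev

def sentence_stats(lengths):
    """Return (plain mean, population std dev, word-weighted mean) of sentence lengths."""
    total_words = sum(lengths)
    # Word-weighted mean: each sentence counts in proportion to its word count,
    # so long sentences dominate the way they dominate the reading experience.
    weighted = sum(n * n for n in lengths) / total_words
    return mean(lengths), pstdev(lengths), weighted

def share_in_long_sentences(lengths, threshold):
    """Fraction of all words that sit in sentences longer than `threshold` words."""
    return sum(n for n in lengths if n > threshold) / sum(lengths)

paragraph_a = [100, 100, 100]          # three 100-word sentences
paragraph_b = [140, 140, 5, 5, 5, 5]   # two 140-word and four 5-word sentences

for name, lengths in (("A", paragraph_a), ("B", paragraph_b)):
    avg, spread, weighted = sentence_stats(lengths)
    share = share_in_long_sentences(lengths, 100)
    print(f"{name}: mean={avg:.1f} stdev={spread:.1f} "
          f"word-weighted mean={weighted:.1f} share>100={share:.1%}")
```

By the plain mean, B looks half as dense as A (50 versus 100), but the word-weighted mean puts B at 131 versus 100 for A, which matches the intuition that B is the harder read.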