Release qdap Version 2.1.1 · trinker/qdap

CHANGES IN qdap VERSION 2.1.1

BUG FIXES

syllable_count returned the sentence (recycled) in the words column of the
output. This behavior has been fixed. See GitHub issue #188 for details.
syn returned antonyms for some words. This was caused by the dictionary:
qdapDictionaries::key.syn contained antonyms and elements that were error
messages (character). This has been fixed. See issue #190. (Jingjing Zou)
The pres_debates2012 data set contained three errors in speech attribution.
These have been corrected, along with the turn of talk (tot).
word_stats would throw an error if no poly-syllable words existed. This has
been corrected (reported by Nicolas Turenne).

NEW FEATURES

qdap_df and %&% added to mimic some of the functionality of dplyr's
tbl_df and chaining pipe in a more specific, less flexible, qdap-oriented
way.
Text added to view and change the text.var attribute of a data.frame of class qdap_df.
cumulative generic method added to view cumulative scores over time.
formality picks up a cumulative method.
polarity picks up a cumulative method.
end_mark picks up a class (end_mark), plot method, and a cumulative
method.
syllable_sum, polysyllable_sum, and combo_syllable_sum pick up a
class, plot method, and a cumulative method.
wfm becomes a generic method currently applied to a text.var that is:
character, factor (coerced to character), or wfdf.
unbag added as a complement to bag_o_words and friends for undoing string
splitting. A convenience wrapper for paste(collapse = " ").
as.Corpus.TermDocumentMatrix, as.Corpus.DocumentTermMatrix, and
as.Corpus.wfm added to convert a matrix format to a tm::Corpus.
exclude becomes a generic method for various classes. Functionality is the
same but with improved code readability.
check_spelling_interactive, check_spelling, which_misspelled, and
correct allow the user to identify potentially misspelled words and
optionally suggest replacements.
random_data & random_sent added to generate random sentence data sets and
vectors.
comma_spacer added to ensure strings with commas contain a space after them.
check_text added to identify potential problems in text.
replace_ordinal added to convert ordinal representations of 1 through 100 to
strictly ordinal text (e.g., "1st" becomes "first").
A vignette: Cleaning Text & Debugging was added to assist users with
cleaning and debugging problems in qdap.
pronoun_type, subject_pronoun_type, and object_pronoun_type added to
examine usage of subject/object pronouns by grouping variable.
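
Two of the helpers listed above are simple enough to approximate in base R.
The sketch below is illustrative only and is not qdap's implementation.
unbag_sketch follows the description of unbag above as a wrapper for
paste(collapse = " "); comma_spacer_sketch and its regular expression are
assumptions about what "a space after commas" could look like.

```r
# Hypothetical stand-in for unbag: the changelog describes it as a
# convenience wrapper for paste(collapse = " ").
unbag_sketch <- function(words) paste(words, collapse = " ")

# Hypothetical stand-in for comma_spacer (assumed regex, not qdap's code):
# insert a space after any comma that is directly followed by a
# non-whitespace character.
comma_spacer_sketch <- function(x) gsub(",(?=\\S)", ", ", x, perl = TRUE)

words <- c("the", "quick", "brown", "fox")
sentence <- unbag_sketch(words)

# Splitting on single spaces and re-joining round-trips the string.
pieces <- strsplit(sentence, " ", fixed = TRUE)[[1]]
round_trip <- identical(unbag_sketch(pieces), sentence)

# Only the comma without a following space is changed.
spaced <- comma_spacer_sketch(c("a,b, c", "no commas here"))
```

Note that this simple regular expression would also turn "1,000" into
"1, 000", so real text with formatted numbers would need more care.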

MINOR FEATURES

dplyr's chaining pipe imported for convenience. See
http://www.rdocumentation.org/packages/magrittr/functions/magrittr for details.

IMPROVEMENTS

wfm gains a speedup through generic classes and tm package integration
(strip is no longer used in wfm).
as.tdm.character and as.dtm.character gain a speed boost with a tm
package integration.
Added message to as.data.frame.Corpus for missing end-marks suggesting the
use of: sent.split = FALSE.
as.Corpus family of functions didn't necessarily respect document names and
sometimes used a numeric sequence instead. The introduction of a reader via
tm::readTabular has fixed this.
sentSplit now gives warnings for text that may contain anomalies such as:
non-ASCII characters, factors, missing punctuation, empty cells, and no
alphabetic characters found.
read.transcript now gives a warning when reading from a .docx file and the
separator (sep) used is still found in the text as this may indicate the
data did not split correctly.
dispersion_plot now takes a named list of vectors of terms as the argument to
match.terms. The vectors are combined as a unified theme named with the
names of the list supplied to match.terms.
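
The idea of grouping several terms under one named theme can be sketched in
base R. This is only an illustration of the concept, not how dispersion_plot
is implemented; the word vector, the theme names, and the helper
theme_positions are all made up for the example.

```r
# Hypothetical tokenized text (made-up data).
words <- c("we", "will", "cut", "taxes", "and", "they", "will",
           "raise", "taxes", "on", "jobs")

# A named list of term vectors, in the shape described above for
# match.terms; the theme names and terms are made up.
match_terms <- list(
  economy = c("taxes", "jobs"),
  people  = c("we", "they")
)

# For each theme, find the positions of words matching any of its terms.
# The result keeps the names of the list, one entry per theme.
theme_positions <- function(words, terms_list) {
  lapply(terms_list, function(terms) which(words %in% terms))
}

hits <- theme_positions(words, match_terms)
```

Each element of hits holds the word positions for one theme, which is the
kind of per-theme grouping a dispersion plot would draw as a single row.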

CHANGES

as.data.frame.Corpus's default value for sent.split is now FALSE.
The state column in the qdap::DATA2 data-set is now character (previously
factor).
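
Code that relied on a column being a factor may need a small adjustment after
a change like this. The sketch below uses made-up data frames (not
qdap::DATA2 itself) and a hypothetical helper, as_text, to show a defensive
pattern that works whether a column arrives as factor or character.

```r
# Made-up data frames standing in for the old (factor) and new
# (character) column types; these are not qdap::DATA2.
old_style <- data.frame(state = c("Hello there.", "How are you?"),
                        stringsAsFactors = TRUE)
new_style <- data.frame(state = c("Hello there.", "How are you?"),
                        stringsAsFactors = FALSE)

# Hypothetical helper: coerce to character only when needed.
as_text <- function(x) if (is.factor(x)) as.character(x) else x

# nchar() requires a character vector, so coerce before calling it.
old_lengths <- nchar(as_text(old_style$state))
new_lengths <- nchar(as_text(new_style$state))
```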
