INDEX
Explanations
mentions of discussing or dialoguing
phrases that include the word "talking."
Negative Logits
iver     -0.73
hement   -0.67
cffff    -0.66
ridge    -0.64
hews     -0.64
edu      -0.64
falls    -0.63
ursor    -0.62
imon     -0.60
emale    -0.59
Positive Logits
about          1.16
ABOUT          0.97
bout           0.89
about          0.85
About          0.78
here           0.75
figur          0.75
specifically   0.72
purely         0.68
About          0.66
Activations Density 0.070%