INDEX
Explanations
specific patterns of words including phrases related to physical and abstract quantities
phrases indicating a high level of certainty or assurance
Negative Logits
help               -0.65
âĢİ                -0.65
atever             -0.65
prising            -0.63
tainment           -0.62
ibliography        -0.62
regrets            -0.61
Helpful            -0.60
åij                -0.59
congratulations    -0.58
Positive Logits
rarely             0.83
wont               0.83
seldom             0.80
already            0.77
scarce             0.76
nature             0.76
relies             0.74
relatively         0.74
constantly         0.73
inception          0.73
Activations Density 0.410%