Explanations
phrases related to predictions or expectations
sentences discussing potential outcomes or evaluations
Negative Logits
illegal       -0.68
ãĤ½           -0.66
Downloadha    -0.66
âĢ            -0.65
Facts         -0.62
ãĤ®           -0.62
è£ıç          -0.61
Dear          -0.61
unlawfully    -0.60
unlawful      -0.60
Positive Logits
kinda         0.97
definitely    0.94
hopefully     0.92
mell          0.91
adapt         0.90
fare          0.87
still         0.86
synerg        0.85
complement    0.84
adjust        0.84
Activations Density 0.489%
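
The density figure can be read as the share of sampled tokens on which the feature fires. The following is a minimal sketch, assuming density means the percentage of tokens whose activation is above zero. The function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the percentage of activations strictly above `threshold`.

    Assumption: "density" is the fraction of tokens on which the
    feature is active, expressed as a percentage.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return 100.0 * active / len(activations)


# Hypothetical sample: 489 active tokens out of 100,000.
sample = [1.0] * 489 + [0.0] * (100_000 - 489)
density = activation_density(sample)
print(f"{density:.3f}%")
```

Under this reading, a density below 1% would indicate a sparse feature that is inactive on the large majority of tokens.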