INDEX
Explanations
phrases that introduce or refer to information that follows
statements about existence or reality
Negative Logits
dding     -0.72
Mata      -0.61
Baron     -0.60
anton     -0.58
Vul       -0.55
Orn       -0.55
Vag       -0.55
ighton    -0.54
Fine      -0.54
-----     -0.54
Positive Logits
relates       1.19
happens       1.00
happened      0.93
stands        0.92
transpired    0.88
unfolded      0.86
unfolds       0.85
appears       0.85
belongs       0.82
alian         0.81
Activation Density: 0.069%