INDEX
Explanations
statements about the potential impact of various actions or events
phrases discussing the effects and consequences of various actions or situations
Negative Logits
etsk            -0.76
Guilty          -0.74
opter           -0.69
Style           -0.68
fuck            -0.65
Puzzles         -0.60
english         -0.60
Franch          -0.60
Requirements    -0.60
dar             -0.60
Positive Logits
negligible      1.39
minimal         1.22
immense         1.20
profound        1.19
immediate       1.11
considerable    1.11
enormous        1.10
substantial     1.10
undeniable      1.05
measurable      1.03
Activations Density: 0.194%