INDEX
Explanations
phrases indicating significant impacts or consequences of events or actions
Negative Logits
Token        Logit
bort         -0.51
Alternative  -0.46
Alternative  -0.46
La           -0.45
сен          -0.45
liez         -0.43
McAllister   -0.43
ams          -0.43
alternative  -0.42
perfec       -0.42
Positive Logits
Token        Logit
impact       1.66
impact       1.54
impacts      1.50
Impact       1.45
effects      1.45
Impact       1.42
effect       1.40
Impacts      1.37
effect       1.36
Impacts      1.36
Activations Density: 0.489%