INDEX
Explanations
verbs related to motivation and influence
Negative Logits
Bett          -0.40
vacation      -0.40
Schnee        -0.38
College       -0.38
ListFragment  -0.37
summer        -0.37
remb          -0.37
Sommers       -0.37
Vacation      -0.37
Compat        -0.36
Positive Logits
Driven    0.96
Driven    0.95
driven    0.90
driven    0.90
Driving   0.87
drive     0.86
drive     0.86
driving   0.85
driving   0.85
drives    0.79
Activations Density: 0.021%