INDEX
Explanations
expressions related to taking actions or making decisions
expressions of desire or intent
Negative Logits
organised     -0.94
favourite     -0.89
favoured      -0.88
colourful     -0.80
practise      -0.79
util          -0.79
organise      -0.78
recognised    -0.76
travellers    -0.75
pract         -0.75
Positive Logits
âĢ            1.15
[/            1.12
âĶ            1.12
----          1.07
âϦ            1.06
âĢ            1.04
--            0.99
**            0.97
––            0.96
-|            0.95
Activations Density 0.669%
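
For readers unfamiliar with the density figure, here is a minimal sketch of how such a number could be computed. It assumes that density means the share of token positions where the feature's activation is above zero; the page itself does not state its definition. The function name, the threshold parameter, and the sample activations are all hypothetical and are not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" is the share of tokens on which the feature fires.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical sample: 1000 token positions, 7 of them with nonzero activation.
sample = [0.0] * 993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 0.7]
density = activation_density(sample)
print(f"Activations Density {density:.3%}")
```

Under this reading, a density of 0.669% would mean the feature fires on roughly 7 of every 1000 tokens in the sampled text.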