KeyError in Django ORM code when annotating with Case and then aggregating that annotation over groups


panda face

We had this problem in a very complex project, but I managed to reproduce it in a dummy project:

This is Django 1.11.7 on Python 3.6 and PostgreSQL 10:

models.py:

from django.db import models


class Thing(models.Model):

    multiplierA = models.IntegerField()
    multiplierB = models.IntegerField()


class Data(models.Model):

    thing = models.ForeignKey('Thing')
    multiplier_choice = models.CharField(max_length=1, choices=(('A', 'use multiplier A'), ('B', 'use multiplier B')))
    option = models.IntegerField(choices=((1, 'option 1'), (2, 'option 2')))
    percentage = models.FloatField()

tests.py:

from django.db.models import Case, F, FloatField, IntegerField, Sum, When
from django.test import TestCase

from .models import Data, Thing


class AnnotateTests(TestCase):

    def test_simple(self):

        thing = Thing.objects.create(multiplierA=2, multiplierB=3)

        Data.objects.create(thing=thing, multiplier_choice='A', option=1, percentage=0.2)
        Data.objects.create(thing=thing, multiplier_choice='A', option=2, percentage=0.3)
        Data.objects.create(thing=thing, multiplier_choice='A', option=3, percentage=0.1)

        Data.objects.create(thing=thing, multiplier_choice='B', option=1, percentage=0.1)
        Data.objects.create(thing=thing, multiplier_choice='B', option=2, percentage=0.4)
        Data.objects.create(thing=thing, multiplier_choice='B', option=3, percentage=0.5)


        whens = [
            When(multiplier_choice='A', then=F('thing__multiplierA')),
            When(multiplier_choice='B', then=F('thing__multiplierB'))
        ]

        multiplier_case = Case(*whens, output_field=IntegerField(), default=0)


        qs = (Data.objects
              # select only certain options to sum up for each thing:
              .filter(thing=thing, option__in=[1, 2])
              # select the correct multiplier
              .annotate(multiplier=multiplier_case)
              # group by thing => sum of percentage * multiplier
              .values('thing')
              .annotate(amount_sum=Sum(F('percentage') * F('multiplier')))
              .values('amount_sum'))

        print(qs.values('thing__id', 'amount_sum'))
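
As a sanity check on what the query is meant to return, the intended amount_sum can be worked out by hand from the fixture rows in the test. The following is a minimal plain-Python sketch with no ORM involved; the names rows, multipliers and expected_amount_sum are only illustrative and do not appear in the project.

```python
# Fixture rows from the test: (multiplier_choice, option, percentage)
rows = [
    ('A', 1, 0.2), ('A', 2, 0.3), ('A', 3, 0.1),
    ('B', 1, 0.1), ('B', 2, 0.4), ('B', 3, 0.5),
]

# Thing(multiplierA=2, multiplierB=3), keyed by multiplier_choice
multipliers = {'A': 2, 'B': 3}


def expected_amount_sum(rows, multipliers, options=(1, 2)):
    """Sum percentage * multiplier over the rows whose option is selected."""
    return sum(
        percentage * multipliers.get(choice, 0)  # 0 mirrors Case(default=0)
        for choice, option, percentage in rows
        if option in options
    )


amount_sum = expected_amount_sum(rows, multipliers)
print(round(amount_sum, 6))
```

With options 1 and 2 selected this is 2 * (0.2 + 0.3) + 3 * (0.1 + 0.4), so the query should yield roughly 2.5 for the single thing, up to floating-point rounding.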

Running this test results in the following traceback:

======================================================================
ERROR: test_simple (annotate.tests.AnnotateTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/robin/src/ormweirdness/annotate/tests.py", line 34, in test_simple
    .annotate(amount_sum=Sum(F('percentage') * F('multiplier')))
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/query.py", line 945, in annotate
    clone.query.add_annotation(annotation, alias, is_summary=False)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/sql/query.py", line 973, in add_annotation
    summarize=is_summary)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/aggregates.py", line 19, in resolve_expression
    c = super(Aggregate, self).resolve_expression(query, allow_joins, reuse, summarize)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/expressions.py", line 548, in resolve_expression
    c.source_expressions[pos] = arg.resolve_expression(query, allow_joins, reuse, summarize, for_save)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/expressions.py", line 412, in resolve_expression
    c.rhs = c.rhs.resolve_expression(query, allow_joins, reuse, summarize, for_save)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/expressions.py", line 471, in resolve_expression
    return query.resolve_ref(self.name, allow_joins, reuse, summarize)
  File "/Users/robin/.virtualenvs/ormweirdness/lib/python3.6/site-packages/django/db/models/sql/query.py", line 1472, in resolve_ref
    return self.annotation_select[name]
KeyError: 'multiplier'

----------------------------------------------------------------------

I found someone on the django-users mailing list who seems to have the same problem. Unfortunately, there was no reply.

What's going on here?

panda face

While I still think the above should work (a bug in the ORM?), I've found that moving the multiplication into the Case expression and using the Case directly inside the Sum solves the problem, since the annotation lookup inside Sum is no longer necessary:

whens = [
    When(multiplier_choice='A', then=F('thing__multiplierA') * F('percentage')),
    When(multiplier_choice='B', then=F('thing__multiplierB') * F('percentage'))
]

multiplier_case = Case(*whens, output_field=FloatField(), default=0)


data_qs = (Data.objects
           # select only certain options to sum up for each thing:
           .filter(thing=thing, option__in=[1, 2])
           .values('thing')
           # group by thing => sum of percentage * multiplier
           .annotate(amount_sum=Sum(multiplier_case))
           .values('amount_sum')
           .order_by('thing'))

This generates the following SQL:

SELECT SUM(CASE
               WHEN "annotate_data"."multiplier_choice" = A THEN ("annotate_thing"."multiplierA" * "annotate_data"."percentage")
               WHEN "annotate_data"."multiplier_choice" = B THEN ("annotate_thing"."multiplierB" * "annotate_data"."percentage")
               ELSE 0
           END) AS "amount_sum"
FROM "annotate_data"
INNER JOIN "annotate_thing" ON ("annotate_data"."thing_id" = "annotate_thing"."id")
WHERE ("annotate_data"."thing_id" = 1
       AND "annotate_data"."option" IN (1,
                                        2))
GROUP BY "annotate_data"."thing_id"
ORDER BY "annotate_data"."thing_id" ASC

If someone finds a solution that doesn't require repeating the whole calculation (which may be a lot more complicated than what we're doing here) in every When, I'll accept your answer as the solution :)
