08. 查询

Pony 提供了一种非常方便的方法，可以使用生成器表达式语法查询数据库。
Pony允许程序员使用原生的Python语法来处理存储在数据库中的对象，就像存储在内存中一样。它使开发变得更加容易。

在编写查询时，你可以使用Python生成器表达式或lambda匿名函数。

使用Python生成器表达式

Pony允许使用生成器表达式作为编写数据库查询的一种非常自然的方式。

Pony 提供了select()函数，它接受 Python 生成器，将其翻译成 SQL，并从数据库中返回对象，翻译的过程在这个StackOverflow question中描述了。

下面是一个查询的例子：

query = select(c for c in Customer
               if sum(o.total_price for o in c.orders) > 1000)

或者，使用属性提升 attribute lifting。

query = select(c for c in Customer
               if sum(c.orders.total_price) > 1000)

在查询中，你可以应用filter()函数：

query2 = query.filter(lambda person: person.age > 18)

也可以根据另一个查询建立新的查询：

query3 = select(customer.name for customer in query2
                if customer.country == 'Canada')

select()函数返回一个Query类的实例，然后可以调用Query对象方法来获取结果，例如：

customer_name = query3.first()

从查询中可以返回实体、属性或元组的任意表达式。

select((c, sum(c.orders.total_price))
       for c in Customer if sum(c.orders.total_price) > 1000)

使用lambda函数

除了使用生成器，你也可以使用lambda函数来编写查询：

Customer.select(lambda c: sum(c.order.price) > 1000)

从将查询转换为SQL的角度来看，如果你使用生成器或lambda，没有什么区别。

唯一的区别是，使用lambda只能返回实体实例--没有办法返回具体的实体属性列表或元组列表。

用于查询数据库的Pony ORM函数

详情请参阅API参考资料中的查询和函数部分。

Pony查询示例

为了演示Pony查询，让我们用Pony ORM发行版中的例子。
你可以在交互模式下自己尝试这些查询，并查看生成的SQL。

为此，我们可以这样导入示例模块：

>>> from pony.orm.examples.estore import *

本模块提供了一个简化的电子商务网店的数据模型，以下是数据模型的ER图：

111.png

下面是实体的定义：

from decimal import Decimal
from datetime import datetime

from pony.converting import str2datetime
from pony.orm import *

db = Database()

class Customer(db.Entity):
    email = Required(str, unique=True)
    password = Required(str)
    name = Required(str)
    country = Required(str)
    address = Required(str)
    cart_items = Set('CartItem')
    orders = Set('Order')

class Product(db.Entity):
    id = PrimaryKey(int, auto=True)
    name = Required(str)
    categories = Set('Category')
    description = Optional(str)
    picture = Optional(buffer)
    price = Required(Decimal)
    quantity = Required(int)
    cart_items = Set('CartItem')
    order_items = Set('OrderItem')

class CartItem(db.Entity):
    quantity = Required(int)
    customer = Required(Customer)
    product = Required(Product)

class OrderItem(db.Entity):
    quantity = Required(int)
    price = Required(Decimal)
    order = Required('Order')
    product = Required(Product)
    PrimaryKey(order, product)

class Order(db.Entity):
    id = PrimaryKey(int, auto=True)
    state = Required(str)
    date_created = Required(datetime)
    date_shipped = Optional(datetime)
    date_delivered = Optional(datetime)
    total_price = Required(Decimal)
    customer = Required(Customer)
    items = Set(OrderItem)

class Category(db.Entity):
    name = Required(str, unique=True)
    products = Set(Product)

set_sql_debug(True)
db.bind('sqlite', 'estore.sqlite', create_db=True)
db.generate_mapping(create_tables=True)

当你导入这个例子时，它将在文件'estore.sqlite'中创建SQLite数据库，并在其中填充一些测试数据，下面你可以看到一些查询的例子：

# 所有美国客户
Customer.select(lambda c: c.country == 'USA')

# 每个国家的客户数量
select((c.country, count(c)) for c in Customer)

# 最大产品价格
max(p.price for p in Product)

# 最大固态硬盘价格
max(p.price for p in Product)
            for cat in p. categories if cat.name =='Solid State Drives')

#最贵的三款产品
Product.select().order_by(desc(Product.price))[:3]

# 缺货产品
Product.select(lambda p: p.quantity ==0)

# 最受欢迎的产品
Product.select().order_by(lambda p: desc(sum(p.order_items.quantity))).first()

# 从未订购过的产品
Product.select(lambda p: not p.order_items)

# 下过几次订单的客户数
Customer.select(lambda c: count(c.order) > 1)

# 最有价值的三个客户
Customer.select().order_by(lambda c: desc(sum(c.orders.total_price)))[:3]

# 订单已发货的客户
Customer.select(lambda c: SHIPPED in c.order.state)

# 没有订单的客户数
Customer.select(lambda c: not c.order)

# 同样的查询，用LEFT JOIN代替NOT EXISTS。
left_join(c for c in Customer for o in c.order if o is None)

# 订购不同药片的客户数
选择(c for c in Customer
         for p in c.order.projects.product.
         if 'Tablets' in p.category.name and count(p) > 1)

你在pony.orm.examples.estore可以找到更多的查询示例。

查询对象方法

详情请参阅API参考资料中的查询结果部分。

在查询中使用日期和时间

可以在查询中用datetime和timedelta进行算术运算。

如果表达式可以用Python计算，Pony会把计算结果作为参数传递到查询中：

select(o for o in Order if o.date_created >= datetime.now() - timedelta(days=3))[:]

SELECT "o". "id", "o". "state", "o". "date_created", "o". "date_shipped",
       "o". "date_delivered", "o". "total_price", "o". "customer".
FROM "订单" "o"
WHERE "o". "date_created">= ?

如果需要用该属性进行操作，我们无法事先计算，这就是为什么这样的表达式会被翻译成SQL的原因：

select(o for o in Order if o.date_created + timedelta(days=3) >= datetime.now())[:]

SELECT "o"."id", "o"."state", "o"."date_created", "o"."date_shipped",
       "o"."date_delivered", "o"."total_price", "o"."customer"
FROM "Order" "o"
WHERE datetime("o"."date_created", '+3 days') >= ?

Pony生成的SQL会因数据库的不同而不同，上面是SQLite的例子，下面是同样的查询，翻译成PostgreSQL：

SELECT "o". "id", "o". "state", "o". "date_created", "o". "date_shipped",
       "o". "date_delivered", "o". "total_price", "o". "customer".
FROM "order" "o"
WHERE ( "o". "date_created" + INTERVAL '72:0:0:0' DAY TO SECOND) >= %(p1)s

如果你需要使用SQL函数，可以使用raw_sql()函数来包含这个SQL片段：

select(m for m in DBVoteMessage if m.date >= raw_sql("NOW() - '1 minute'::INTERVAL"))

使用Pony可以使用日期时间属性，如月、小时等，根据数据库的不同，会被翻译成不同的SQL，提取这个属性的值，在这个例子中，我们得到的是月份属性：

select(o for o in Order if o.date_created.month ==12)

下面是SQLite的翻译结果：

SELECT "o"."id", "o"."state", "o"."date_created", "o"."date_shipped",
       "o"."date_delivered", "o"."total_price", "o"."customer"
FROM "Order" "o"
WHERE cast(substr("o"."date_created", 6, 2) as integer) = 12

而对于PostgreSQL则是：

SELECT "o"."id", "o"."state", "o"."date_created", "o"."date_shipped",
       "o"."date_delivered", "o"."total_price", "o"."customer"
FROM "order" "o"
WHERE EXTRACT(MONTH FROM "o"."date_created") = 12

自动唯一

Pony试图通过在必要的地方自动添加DISTINCT SQL关键字来避免查询结果中的重复，因为重复且有用的查询结果很难遇到。

当有人想要检索具有特定条件的对象时，通常不会期望同一个对象会被多次返回。

同时，避免重复的查询结果也使查询结果更具有可预测性：你不需要从查询结果中筛选出重复的对象。

Pony只有在可能存在重复的情况下才会添加disctinct关键字，让我们考虑几个例子：

1. 检索有条件的对象

Person.select(lambda p: p.age > 20 and p.name == 'John')

在这个例子中，查询不返回重复，因为结果中包含了Person的主键列，因为这里不可能出现重复，所以没有必要添加DISTINCT关键字，Pony也不添加：

SELECT "p"."id", "p"."name", "p"."age"
FROM "Person" "p"
WHERE "p"."age" > 20
  AND "p"."name" = 'John'

2. 检索对象的属性

select(p.name for p in Person)

这个查询结果返回的不是对象，而是它的属性，这个查询结果可能包含重复，所以Pony会在这个查询中加入DISTINCT。

SELECT DISTINCT "p". "name"
FROM "Person" "p"

这种查询的结果通常用于下拉列表，在这里不会出现重复的情况，当你想在这里出现重复的情况时，想出一个真正的用例并不容易。

如果你需要统计同名的人，最好使用聚合查询：

select((p.name, count(p)) for p in Person)

但如果绝对需要获取所有的人名，包括重复的人名，可以使用Query.without_distinct()方法。

select(p.name for p in Person).without_distinct()

3. 使用联接来检索对象

select(p for p in Person for c in p.cars if c.make in ("Toyota", "Honda"))

这个查询可能包含重复的内容，所以Pony使用DISTINCT来消除它们。

SELECT DISTINCT "p"."id", "p"."name", "p"."age"
FROM "Person" "p", "Car" "c"
WHERE "c"."make" IN ('Toyota', 'Honda')
  AND "p"."id" = "c"."owner"

如果不使用DISTINCT，就有可能出现重复，因为查询使用了两个表（Person和Car），但SELECT部分只使用了一个表。

上面的查询只返回Person（而不是他们的车），因此一般情况下，在结果中多次得到同一个人是不可取的。

我们认为，如果没有重复，结果看起来会更直观。

但如果因为某些原因不需要排除重复，你总是可以在查询中添加without_distinct()：

select(p for p in Person for c in p.cars
         if c.make in ("Toyota", "Honda")).without_distinct()

如果查询结果中包含每个人拥有的汽车，用户可能希望看到Person对象重复，在这种情况下，PONY查询就不同了。

select((p, c) for p in Person for c in p.cars if c.make in ("Toyota", "Honda"))

并且在这种情况下，Pony不会在SQL查询中添加DISTINCT关键字。

总结一下：

"所有查询默认情况下不返回重复查询 "这个原则很容易理解，也不会导致意外。
在大多数情况下，这样的行为是大多数用户想要的。
Pony在查询不应该有重复的情况下，不会添加DISTINCT。
方法without_distinct()可以用来强制Pony不消除重复的查询。

可以在查询中使用的函数

下面是可以在生成器查询中使用的函数列表：

avg()
abs()
exists()
len()
max()
min()
count()
concat()
group_concat()
random()
raw_sql()
select()
sum()
getattr()

例如：

select(avg(c.orders.total_price) for c in Customer)

SELECT AVG("order-1"."total_price")
FROM "Customer" "c"
  LEFT JOIN "Order" "order-1"
    ON "c"."id" = "order-1"."customer"

select(o for o in Order if o.customer in
       select(c for c in Customer if c.name.startswith('A')))[:]

SELECT "o"."id", "o"."state", "o"."date_created", "o"."date_shipped",
       "o"."date_delivered", "o"."total_price", "o"."customer"
FROM "Order" "o"
WHERE "o"."customer" IN (
    SELECT "c"."id"
    FROM "Customer" "c"
    WHERE "c"."name" LIKE 'A%'
    )

使用getattr()

getattr() 是一个内置的Python函数，可以用来获取属性值：

例如：

attr_name = 'name'
param_value = 'John'
select(c for c in Customer if getattr(c, attr_name) === param_value)

使用原始SQL

Pony允许在查询中使用原始SQL，关于如何使用原始SQL有两种选择。

使用raw_sql()函数来写一个生成器或lambda查询的一部分。
使用Entity.select_by_sql()或Entity.get_by_sql()方法编写一个完整的SQL查询。

使用raw_sql()函数

让我们来探讨一下使用raw_sql()函数的例子，下面是方案及初始化数据：

from datetime import date
from pony.orm import *

db = Database('sqlite', ':memory:')

class Person(db.Entity):
    id = PrimaryKey(int)
    name = Required(str)
    age = Required(int)
    dob = Required(date)

db.generate_mapping(create_tables=True)

with db_session:
    Person(id=1, name='John', age=30, dob=date(1986, 1, 1))
    Person(id=2, name='Mike', age=32, dob=date(1984, 5, 20))
    Person(id=3, name='Mary', age=20, dob=date(1996, 2, 15))

raw_sql()结果可以作为逻辑表达式处理：

select(p for p in Person if raw_sql('abs("p"."age") > 25'))

raw_sql()结果可以用来进行比较：

q = Person.select(lambda x: raw_sql('abs("x"."age")') > 25)
print(q.get_sql())

SELECT "x"."id", "x"."name", "x"."age", "x"."dob"
FROM "Person" "x"
WHERE abs("x"."age") > 25

另外，在上面的例子中，我们在一个lambda查询中使用raw_sql()，并打印出结果的SQL。正如你所看到的，raw_sql()部分变成了整个查询的一部分。

raw_sql()可以接受$parameters：

x = 25
select(p for p in Person if raw_sql('abs("p"."age") > $x'))

你可以动态地改变raw_sql()函数的内容，并且仍然可以使用里面的参数：

x = 1
s = 'p.id > $x'
select(p for p in Person if raw_sql(s)

另一种使用动态原始SQL内容的方式：

x = 1
cond = raw_sql('p.id > $x')
select(p for p in Person if cond)

你可以在原始SQL查询中使用各种类型：

x = date(1990, 1, 1)
select(p for p in Person if raw_sql('p.dob < $x'))

原始SQL部分里面的参数可以组合：

x = 10
y = 15
select(p for p in Person if raw_sql('p.age > $(x + y)'))

你甚至可以在里面调用Python函数：

select(p for p in Person if raw_sql('p.dob < $date.today()'))

raw_sql()函数不仅可以用于条件部分，还可以用于返回查询结果的部分。

names = select(raw_sql('UPPER(p.name)') for p in Person)[:]
print(names)

['John', 'Mike', 'Mary']

但是当你使用raw_sql()函数返回数据时，你可能需要指定结果的类型，因为Pony不知道结果的类型是什么：

dates = select(raw_sql('(p.dob)')) for p in Person)[:]
print(dates)

['1985-01-01', '1983-05-20', '1995-02-15']

如果你想以日期列表的形式得到结果，你需要指定结果类型：

dates = select(raw_sql('(p.dob)', result_type=date) for p in Person)[:]
print(dates)

[datetime.date(1986, 1, 1), datetime.date(1984, 5, 20), datetime.date(1996, 2, 15)]

在Query.filter()中也可以使用raw_sql()函数：

x = 25
select(p for p in Person).filter(lambda p: p.age > raw_sql('$x'))

它可以在Query.filter()中使用，不需要lambda，在这种情况下，你必须使用实体名称的首字母小写作为别名：

x = 25
Person.select().filter(raw_sql('p.age > $x'))

你可以在一个查询中使用多个raw_sql()表达式:

x = '123'
y = 'John'
Person.select(lambda p: raw_sql("UPPER(p.name) || $x")
                        ==raw_sql("UPPER($y || '123')")

同样的参数名称可以用不同的类型和值多次使用。

x = 10
y = 31
q = select(p for p in Person if p.age > x and p.age < raw_sql('$y'))
x = date(1980, 1, 1)
y = 'j'
q = q.filter(lambda p: p.dob > x and p.name.startwith(raw_sql('UPPER($y)')))
persons = q[:]

你可以在Query.order_by()部分使用raw_sql()。

x = 9
Person.select().order_by(lambda p: raw_sql('SUBSTR(p.dob, $x)'))

如果你在之前的过滤器中使用了相同的别名，则可以不使用lambda，在这种情况下，我们使用默认的别名--实体名称的第一个字母。

x = 9
Person.select().order_by(raw_sql('SUBSTR(p.dob, $x)'))

使用select_by_sql()和get_by_sql()方法

虽然Pony可以将几乎所有用Python编写的条件翻译成SQL，但有时需要使用原始SQL，例如，为了调用存储过程或使用特定数据库系统的方言功能，需要使用原始SQL。

在这种情况下，Pony允许用户以原始SQL的形式写一个查询，方法是把它放在函数Entity.select_by_sql()或Entity.get_by_sql()的内部，作为一个字符串。

Product.select_by_sql("SELECT * FROM Products")

与Entity.select()方法不同，Entity.select_by_sql()方法不返回查询对象，而是返回一个实体实例列表。

参数使用下面的语法传递，" $name_variable "或"$ (Python中的表达式)"，例如：

x = 1000
y = 500
Product.select_by_sql("SELECT * FROM Product WHERE price > $x OR price = $(y * 2)")

当Pony在原始SQL查询中遇到一个参数时，它会从当前帧（来自globals和locals）或从可以作为参数传递的字典中获取变量值。

Product.select_by_sql("SELECT * FROM Product WHERE price > $x OR price = $(y * 2)",
                       globals={'x': 100}, locals={'y': 200})

变量和更复杂的表达式在$符号后指定，将自动计算并作为参数传输到查询中，这使得SQL注入成为不可能。

Pony会自动将查询字符串中的$x用"?"、"%S "或其他参数样式代替，在你的数据库中使用。

如果你需要在查询中使用 $符号（例如，在系统表的名称中），你必须连续写两个$ 符号：$$。