This file is indexed.

/usr/share/doc/python3-postgresql/html/_sources/copyman.txt is in python3-postgresql 1.1.0-1build1.

This file is owned by root:root, with mode 0o644.

The actual contents of the file can be viewed below.

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
.. _pg_copyman:

***************
Copy Management
***************

The `postgresql.copyman` module provides a way to quickly move COPY data coming
from one connection to many connections. Alternatively, it can be sourced
by arbitrary iterators and target arbitrary callables.

Statement execution methods offer a way for running COPY operations
with iterators, but the cost of allocating objects for each row is too
significant for transferring gigabytes of COPY data from one connection to
another. The interfaces available on statement objects are primarily intended to
be used when transferring COPY data to and from arbitrary Python
objects.

Direct connection-to-connection COPY operations can be performed using the
high-level `postgresql.copyman.transfer` function::

	>>> from postgresql import copyman
	>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
	>>> destination.execute("CREATE TEMP TABLE loading_table (i int8)")
	>>> receive_stmt = destination.prepare("COPY loading_table FROM STDIN")
	>>> total_rows, total_bytes = copyman.transfer(send_stmt, receive_stmt)

However, if more control is needed, the `postgresql.copyman.CopyManager` class
should be used directly.


Copy Managers
=============

The `postgresql.copyman.CopyManager` class manages the Producer and the
Receivers involved in a COPY operation. Normally,
`postgresql.copyman.StatementProducer` and
`postgresql.copyman.StatementReceiver` instances. Naturally, a Producer is the
object that produces the COPY data to be given to the Manager's Receivers.

Using a Manager directly means that there is a need for more control over
the operation. The Manager is both a context manager and an iterator. The
context manager interfaces handle initialization and finalization of the COPY
state, and the iterator provides an event loop emitting information about the
amount of COPY data transferred this cycle. Normal usage takes the form::

	>>> from postgresql import copyman
	>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
	>>> destination.execute("CREATE TEMP TABLE loading_table (i int8)")
	>>> receive_stmt = destination.prepare("COPY loading_table FROM STDIN")
	>>> producer = copyman.StatementProducer(send_stmt)
	>>> receiver = copyman.StatementReceiver(receive_stmt)
	>>> 
	>>> with source.xact(), destination.xact():
	...  with copyman.CopyManager(producer, receiver) as copy:
	...   for num_messages, num_bytes in copy:
	...    update_rate(num_bytes)

As an alternative to a for-loop inside a with-statement block, the `run` method
can be called to perform the operation::

	>>> with source.xact(), destination.xact():
	...  copyman.CopyManager(producer, receiver).run()

However, there is little benefit beyond using the high-level
`postgresql.copyman.transfer` function.

Manager Interface Points
------------------------

Primarily, the `postgresql.copyman.CopyManager` provides a context manager and
an iterator for controlling the COPY operation.

 ``CopyManager.run()``
  Perform the entire COPY operation.

 ``CopyManager.__enter__()``
  Start the COPY operation. Connections taking part in the COPY should **not**
  be used until ``__exit__`` is ran.

 ``CopyManager.__exit__(typ, val, tb)``
  Finish the COPY operation. Fails in the case of an incomplete
  COPY, or an untrapped exception. Either returns `None` or raises the generalized
  exception, `postgresql.copyman.CopyFail`.

 ``CopyManager.__iter__()``
  Returns the CopyManager instance.

 ``CopyManager.__next__()``
  Transfer the next chunk of COPY data to the receivers. Yields a tuple
  consisting of the number of messages and bytes transferred,
  ``(num_messages, num_bytes)``. Raises `StopIteration` when complete.

  Raises `postgresql.copyman.ReceiverFault` when a Receiver raises an
  exception.
  Raises `postgresql.copyman.ProducerFault` when the Producer raises an
  exception. The original exception is available via the exception's
  ``__context__`` attribute.

 ``CopyManager.reconcile(faulted_receiver)``
  Reconcile a faulted receiver. When a receiver faults, it will no longer
  be in the set of Receivers. This method is used to signal to the manager that the
  problem has been corrected, and the receiver is again ready to receive.

 ``CopyManager.receivers``
  The `builtins.set` of Receivers involved in the COPY operation.

 ``CopyManager.producer``
  The Producer emitting the data to be given to the Receivers.


Faults
======

The CopyManager generalizes any exceptions that occur during transfer. While
inside the context manager, `postgresql.copyman.Fault` may be raised if a
Receiver or a Producer raises an exception. A `postgresql.copyman.ProducerFault`
in the case of the Producer, and `postgresql.copyman.ReceiverFault` in the case
of the Receivers.

.. note::
 Faults are only raised by `postgresql.copyman.CopyManager.__next__`. The
 ``run()`` method will only raise `postgresql.copyman.CopyFail`.

Receiver Faults
---------------

The Manager assumes the Fault is fatal to a Receiver, and immediately removes
it from the set of target receivers. Additionally, if the Fault exception goes
untrapped, the copy will ultimately fail.

The Fault exception references the Manager that raised the exception, and the
actual exceptions that occurred associated with the Receiver that caused them.

In order to identify the exception that caused a Fault, the ``faults`` attribute
on the `postgresql.copyman.ReceiverFault` must be referenced::

	>>> from postgresql import copyman
	>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
	>>> destination.execute("CREATE TEMP TABLE loading_table (i int8)")
	>>> receive_stmt = destination.prepare("COPY loading_table FROM STDIN")
	>>> producer = copyman.StatementProducer(send_stmt)
	>>> receiver = copyman.StatementReceiver(receive_stmt)
	>>> 
	>>> with source.xact(), destination.xact():
	...  with copyman.CopyManager(producer, receiver) as copy:
	...   while copy.receivers:
	...    try:
	...     for num_messages, num_bytes in copy:
	...      update_rate(num_bytes)
	...     break
	...    except copyman.ReceiverFault as cf:
	...     # Access the original exception using the receiver as the key.
	...     original_exception = cf.faults[receiver]
	...     if unknown_failure(original_exception):
	...      ...
	...      raise


ReceiverFault Properties
~~~~~~~~~~~~~~~~~~~~~~~~

The following attributes exist on `postgresql.copyman.ReceiverFault` instances:

 ``ReceiverFault.manager``
  The subject `postgresql.copyman.CopyManager` instance.

 ``ReceiverFault.faults``
  A dictionary mapping the Receiver to the exception raised by that Receiver.


Reconciliation
~~~~~~~~~~~~~~

When a `postgresql.copyman.ReceiverFault` is raised, the Manager immediately
removes the Receiver so that the COPY operation can continue. Continuation of
the COPY can occur by trapping the exception and continuing the iteration of the
Manager. However, if the fault is recoverable, the
`postgresql.copyman.CopyManager.reconcile` method must be used to reintroduce the
Receiver into the Manager's set. Faults must be trapped from within the
Manager's context::

	>>> import socket
	>>> from postgresql import copyman
	>>> send_stmt = source.prepare("COPY (SELECT i FROM generate_series(1, 1000000) AS g(i)) TO STDOUT")
	>>> destination.execute("CREATE TEMP TABLE loading_table (i int8)")
	>>> receive_stmt = destination.prepare("COPY loading_table FROM STDIN")
	>>> producer = copyman.StatementProducer(send_stmt)
	>>> receiver = copyman.StatementReceiver(receive_stmt)
	>>> 
	>>> with source.xact(), destination.xact():
	...  with copyman.CopyManager(producer, receiver) as copy:
	...   while copy.receivers:
	...    try:
	...     for num_messages, num_bytes in copy:
	...      update_rate(num_bytes)
	...    except copyman.ReceiverFault as cf:
	...     if isinstance(cf.faults[receiver], socket.timeout):
	...      copy.reconcile(receiver)
	...     else:
	...      raise

Recovering from Faults does add significant complexity to a COPY operation,
so, often, it's best to avoid conditions in which reconciliable Faults may
occur.


Producer Faults
---------------

Producer faults are normally fatal to the COPY operation and should rarely be
trapped. However, the Manager makes no state changes when a Producer faults,
so, unlike Receiver Faults, no reconciliation process is necessary; rather,
if it's safe to continue, the Manager's iterator should continue to be
processed.

ProducerFault Properties
~~~~~~~~~~~~~~~~~~~~~~~~

The following attributes exist on `postgresql.copyman.ProducerFault` instances:

 ``ReceiverFault.manager``
  The subject `postgresql.copyman.CopyManager`.

 ``ReceiverFault.__context__``
  The original exception raised by the Producer.


Failures
========

When a COPY operation is aborted, either by an exception or by the iterator
being broken, a `postgresql.copyman.CopyFail` exception will be raised by the
Manager's ``__exit__()`` method. The `postgresql.copyman.CopyFail` exception
offers to record any exceptions that occur during the exit of the context
managers of the Producer and the Receivers.


CopyFail Properties
-------------------

The following properties exist on `postgresql.copyman.CopyFail` exceptions:

 ``CopyFail.manager``
  The Manager whose COPY operation failed.

 ``CopyFail.receiver_faults``
  A dictionary mapping a `postgresql.copyman.Receiver` to the exception raised
  by that Receiver's ``__exit__``. `None` if no exceptions were raised by the
  Receivers.

 ``CopyFail.producer_fault``
  The exception Raised by the `postgresql.copyman.Producer`. `None` if none.


Producers
=========

The following Producers are available:

 ``postgresql.copyman.StatementProducer(postgresql.api.Statement)``
  Given a Statement producing COPY data, construct a Producer.

 ``postgresql.copyman.IteratorProducer(collections.Iterator)``
  Given an Iterator producing *chunks* of COPY lines, construct a Producer to
  manage the data coming from the iterator.


Receivers
=========

 ``postgresql.copyman.StatementReceiver(postgresql.api.Statement)``
  Given a Statement producing COPY data, construct a Producer.

 ``postgresql.copyman.CallReceiver(callable)``
  Given a callable, construct a Receiver that will transmit COPY data in chunks
  of lines. That is, the callable will be given a list of COPY lines for each
  transfer cycle.


Terminology
===========

The following terms are regularly used to describe the implementation and
processes of the `postgresql.copyman` module:

 Manager
  The object used to manage data coming from a Producer and being given to the
  Receivers. It also manages the necessary initialization and finalization steps
  required by those factors.

 Producer
  The object used to produce the COPY data to be given to the Receivers. The
  source.

 Receiver
  An object that consumes COPY data. A target.

 Fault
  Specifically, `postgresql.copyman.Fault` exceptions. A Fault is raised
  when a Receiver or a Producer raises an exception during the COPY operation.

 Reconciliation
  Generally, the steps performed by the "reconcile" method on
  `postgresql.copyman.CopyManager` instances. More precisely, the
  necessary steps for a Receiver's reintroduction into the COPY operation after
  a Fault.

 Failed Copy
  A failed copy is an aborted COPY operation. This occurs in situations of
  untrapped exceptions or an incomplete COPY. Specifically, the COPY will be
  noted as failed in cases where the Manager's iterator is *not* ran until
  exhaustion.

 Realignment
  The process of providing compensating data to the Receivers so that the
  connection will be on a message boundary. Occurs when the COPY operation
  is aborted.