/usr/lib/R/site-library/tibble/doc/extending.Rmd is in r-cran-tibble 1.4.1-1ubuntu1.

This file is owned by root:root, with mode 0o644.

---
title: "Extending tibble"
author: "Kirill Müller, Hadley Wickham"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Extending tibble}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

To extend the tibble package for new types of columnar data, you need to understand how printing works. The presentation of a column in a tibble is powered by four S3 generics:

* `type_sum()` determines what goes into the column header.
* `pillar_shaft()` determines what goes into the body of the column.
* `is_vector_s3()` and `obj_sum()` are used when rendering list columns.

If you have written an S3 or S4 class that can be used as a column, you can override these generics to make sure your data prints well in a tibble. To start, you must import the `pillar` package that powers the printing of tibbles. Either add `pillar` to the `Imports:` section of your `DESCRIPTION`, or simply call:

```{r, eval = FALSE}
usethis::use_package("pillar")
```

This short vignette assumes a package that implements an S3 class `"latlon"` and uses `roxygen2` to create documentation and the `NAMESPACE` file. For this vignette to work, we also need to attach pillar with `library(pillar)`.


## Prerequisites

We define a class `"latlon"` that encodes geographic coordinates in a complex number. For simplicity, the values are printed as degrees and minutes only.

```{r}
#' @export
latlon <- function(lat, lon) {
  as_latlon(complex(real = lon, imaginary = lat))
}

#' @export
as_latlon <- function(x) {
  structure(x, class = "latlon")
}

#' @export
c.latlon <- function(x, ...) {
  as_latlon(NextMethod())
}

#' @export
`[.latlon` <- function(x, i) {
  as_latlon(NextMethod())
}

#' @export
format.latlon <- function(x, ..., formatter = deg_min) {
  x_valid <- which(!is.na(x))

  lat <- unclass(Im(x[x_valid]))
  lon <- unclass(Re(x[x_valid]))

  ret <- rep("<NA>", length(x))
  ret[x_valid] <- paste(
    formatter(lat, c("N", "S")),
    formatter(lon, c("E", "W"))
  )
  format(ret, justify = "right")
}

deg_min <- function(x, pm) {
  sign <- sign(x)
  x <- abs(x)
  deg <- trunc(x)
  x <- x - deg
  min <- round(x * 60)

  ret <- sprintf("%d°%.2d'%s", deg, min, pm[ifelse(sign >= 0, 1, 2)])
  format(ret, justify = "right")
}

#' @export
print.latlon <- function(x, ...) {
  cat(format(x), sep = "\n")
  invisible(x)
}

latlon(32.7102978, -117.1704058)
```

More methods are needed to make this class fully compatible with data frames; see, for example, the [hms](https://github.com/tidyverse/hms/) package for a more complete implementation.
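
To make the degree-and-minute conversion concrete, here is a minimal standalone sketch of the arithmetic that `deg_min()` performs. The helper name `split_deg_min()` is hypothetical and used only for illustration; it returns the numeric pieces instead of a formatted string.

```r
# Hypothetical helper (not part of the vignette's package): split decimal
# degrees into whole degrees, rounded minutes, and a hemisphere label.
split_deg_min <- function(x, pm) {
  hemisphere <- pm[ifelse(sign(x) >= 0, 1, 2)]
  x <- abs(x)
  deg <- trunc(x)
  min <- round((x - deg) * 60)
  data.frame(
    deg = deg, min = min, hemisphere = hemisphere,
    stringsAsFactors = FALSE
  )
}

# Latitude and longitude of the example point used above
split_deg_min(32.7102978, c("N", "S"))    # 32 degrees, 43 minutes, "N"
split_deg_min(-117.1704058, c("E", "W"))  # 117 degrees, 10 minutes, "W"

# Caveat: rounding can yield 60 minutes instead of carrying into the degrees
split_deg_min(10.9999, c("N", "S"))       # 10 degrees, 60 minutes
```

The last call points at a limitation that `deg_min()` as written appears to share: a value just below a whole degree is shown with 60 minutes. A more careful implementation would probably carry the overflow into the degrees.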


## Using in a tibble

Columns of this class can be used in a tibble right away, but the output will be less than ideal:

```{r}
library(tibble)
data <- tibble(
  venue = "rstudio::conf",
  year  = 2017:2019,
  loc   = latlon(
    c(28.3411783, 32.7102978, NA),
    c(-81.5480348, -117.1704058, NA)
  ),
  paths = list(
    loc[1],
    c(loc[1], loc[2]),
    loc[2]
  )
)

data
```

(The `paths` column is a list that contains arbitrary data, in our case `latlon` vectors. A list column is a powerful way to attach hierarchical or unstructured data to an observation in a data frame.)

The output has three main problems:

1. The column type of the `loc` column is displayed as `<S3: latlon>`.  This default formatting works reasonably well for any kind of object, but the generated output may be too wide and waste precious space when displaying the tibble.
1. The values in the `loc` column are formatted as complex numbers (the underlying storage), without using the `format()` method we have defined. This is by design.
1. The cells in the `paths` column are also displayed as `<S3: latlon>`.

In the remainder of this vignette, I'll show how to fix these problems, and also how to implement rendering that adapts to the available width.


## Fixing the data type

To display `<geo>` as the data type, we need to provide a `type_sum()` method for our class. This method should return a string that can be used in a column header. For your own classes, strive for an evocative abbreviation that's under 6 characters.


```{r include=FALSE}
import::from(pillar, type_sum)
```

```{r}
#' @importFrom pillar type_sum
#' @export
type_sum.latlon <- function(x) {
  "geo"
}
```

Because the value shown there doesn't depend on the data, we just return a constant. (For date-times, the column info will eventually contain information about the timezone, see [#53](https://github.com/r-lib/pillar/pull/53).)

```{r}
data
```
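
The mechanism behind this is ordinary S3 dispatch. The following sketch uses a stand-in generic, `type_sum_demo()`, which is hypothetical and is not pillar's implementation; it only illustrates how a class-specific method takes precedence over a default.

```r
# Stand-in generic for illustration only; pillar's real generic is type_sum()
type_sum_demo <- function(x) UseMethod("type_sum_demo")

# Fallback: describe the object by its first class
type_sum_demo.default <- function(x) {
  paste0("S3: ", class(x)[[1]])
}

# Class-specific method: a short constant abbreviation
type_sum_demo.latlon <- function(x) {
  "geo"
}

loc <- structure(complex(real = -117.17, imaginary = 32.71), class = "latlon")
other <- structure(1:3, class = "my_class")

type_sum_demo(loc)    # dispatches to the latlon method
type_sum_demo(other)  # falls back to the default method
```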


## Rendering the value

To use our format method for rendering, we implement the `pillar_shaft()` method for our class. (A [*pillar*](https://en.wikipedia.org/wiki/Column#Nomenclature) is mainly a *shaft* (decorated with an *ornament*), with a *capital* above and a *base* below. Multiple pillars form a *colonnade*, which can be stacked in multiple *tiers*. This is the motivation behind the names in our API.)

```{r include=FALSE}
import::from(pillar, pillar_shaft)
```

```{r}
#' @importFrom pillar pillar_shaft
#' @export
pillar_shaft.latlon <- function(x, ...) {
  out <- format(x)
  out[is.na(x)] <- NA
  pillar::new_pillar_shaft_simple(out, align = "right")
}
```

The simplest variant calls our `format()` method; everything else is handled by pillar, in particular by the `new_pillar_shaft_simple()` helper. Note how the `align` argument affects the alignment of NA values and of the column name and type.

```{r}
data
```

We could also use left alignment and indent only the `NA` values:

```{r}
#' @importFrom pillar pillar_shaft
#' @export
pillar_shaft.latlon <- function(x, ...) {
  out <- format(x)
  out[is.na(x)] <- NA
  pillar::new_pillar_shaft_simple(out, align = "left", na_indent = 5)
}

data
```


## Adaptive rendering

If there is not enough space to render the values, the formatted values are truncated with an ellipsis. This doesn't currently apply to our class, because we haven't specified a minimum width for our values:

```{r}
print(data, width = 35)
```

If we specify a minimum width when constructing the shaft, the `loc` column will be truncated:

```{r}
#' @importFrom pillar pillar_shaft
#' @export
pillar_shaft.latlon <- function(x, ...) {
  out <- format(x)
  out[is.na(x)] <- NA
  pillar::new_pillar_shaft_simple(out, align = "right", min_width = 10)
}

print(data, width = 35)
```

This may be useful for character data, but for lat-lon data we may prefer to show full degrees and remove the minutes if the available space is not enough to show accurate values. A more sophisticated implementation of the `pillar_shaft()` method is required to achieve this:

```{r}
#' @importFrom pillar pillar_shaft
#' @export
pillar_shaft.latlon <- function(x, ...) {
  deg <- format(x, formatter = deg)
  deg[is.na(x)] <- pillar::style_na("NA")
  deg_min <- format(x)
  deg_min[is.na(x)] <- pillar::style_na("NA")
  pillar::new_pillar_shaft(
    list(deg = deg, deg_min = deg_min),
    width = pillar::get_max_extent(deg_min),
    min_width = pillar::get_max_extent(deg),
    subclass = "pillar_shaft_latlon"
  )
}
```

Here, `pillar_shaft()` returns an object of the `"pillar_shaft_latlon"` class created by the generic `new_pillar_shaft()` constructor. This object contains the necessary information to render the values, and also minimum and maximum width values. For simplicity, both formattings are pre-rendered, and the minimum and maximum widths are computed from there. Note that we also need to take care of `NA` values explicitly. (`get_max_extent()` is a helper that computes the maximum display width occupied by the values in a character vector.)

For completeness, the code that implements the degree-only formatting looks like this:

```{r}
deg <- function(x, pm) {
  sign <- sign(x)
  x <- abs(x)
  deg <- round(x)

  ret <- sprintf("%d°%s", deg, pm[ifelse(sign >= 0, 1, 2)])
  format(ret, justify = "right")
}
```

All that's left to do is to implement a `format()` method for our new `"pillar_shaft_latlon"` class. This method will be called with a `width` argument, which then determines which of the formattings to choose:

```{r}
#' @export
format.pillar_shaft_latlon <- function(x, width, ...) {
  if (all(crayon::col_nchar(x$deg_min) <= width)) {
    ornament <- x$deg_min
  } else {
    ornament <- x$deg
  }

  pillar::new_ornament(ornament)
}

data
print(data, width = 35)
```
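
The selection logic can be tried without pillar. The sketch below is a simplified, hypothetical stand-in (`choose_rendering()` is not a pillar function) that picks the first pre-rendered variant whose widest element fits the available width, and falls back to the most compact one otherwise. It assumes plain strings without ANSI escapes, so `nchar()` is sufficient here.

```r
# Hypothetical helper: `candidates` is a list of character vectors,
# ordered from most detailed to most compact.
choose_rendering <- function(candidates, width) {
  for (candidate in candidates) {
    if (all(nchar(candidate, type = "width") <= width)) {
      return(candidate)
    }
  }
  # Nothing fits: return the most compact variant
  candidates[[length(candidates)]]
}

# ASCII "d" stands in for the degree sign to keep the example portable
renderings <- list(
  deg_min = c("28d20'N  81d33'W", "32d43'N 117d10'W"),
  deg     = c("28dN  82dW", "33dN 117dW")
)

choose_rendering(renderings, width = 20)  # detailed variant fits
choose_rendering(renderings, width = 12)  # falls back to degrees only
```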


## Adding color

Both `new_pillar_shaft_simple()` and `new_ornament()` accept ANSI escape codes for coloring, emphasis, or other ways of highlighting text on terminals that support it. Some formattings are predefined, e.g. `style_subtle()` displays text in a light gray. For default data types, this style is used for insignificant digits. We'll be formatting the degree and minute signs in a subtle style, because they serve only as separators. You can also use the [crayon](https://cran.r-project.org/package=crayon) package to add custom formattings to your output.

```{r}
#' @importFrom pillar pillar_shaft
#' @export
pillar_shaft.latlon <- function(x, ...) {
  out <- format(x, formatter = deg_min_color)
  out[is.na(x)] <- NA
  pillar::new_pillar_shaft_simple(out, align = "left", na_indent = 5)
}

deg_min_color <- function(x, pm) {
  sign <- sign(x)
  x <- abs(x)
  deg <- trunc(x)
  x <- x - deg
  min <- round(x * 60)  # minutes, rounded to the nearest whole minute
  ret <- sprintf(
    "%d%s%.2d%s%s",
    deg,
    pillar::style_subtle("°"),
    min,
    pillar::style_subtle("'"),
    pm[ifelse(sign >= 0, 1, 2)]
  )
  ret[is.na(x)] <- ""
  format(ret, justify = "right")
}

data
```

Currently, ANSI escapes are not rendered in vignettes, so the display here isn't much different from earlier examples. This may change in the future.
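
One practical consequence of styling is that a string's character count no longer equals its display width, which is why the `format()` method for `"pillar_shaft_latlon"` measures with `crayon::col_nchar()` rather than `nchar()`. The base R sketch below illustrates this with hand-written escape sequences. The specific SGR codes (90 for bright black, 39 to reset the foreground) are an assumption chosen for illustration and are not necessarily what `style_subtle()` emits; the helper names are hypothetical.

```r
# Wrap text in ANSI SGR escape sequences (codes chosen for illustration)
subtle_demo <- function(x) {
  paste0("\033[90m", x, "\033[39m")
}

# Remove SGR sequences so that only the visible characters remain
strip_ansi_demo <- function(x) {
  gsub("\033\\[[0-9;]*m", "", x, perl = TRUE)
}

# ASCII "d" stands in for the degree sign to keep the example portable
styled <- paste0("32", subtle_demo("d"), "43", subtle_demo("'"), "N")

nchar(styled)                   # counts the escape sequences too
nchar(strip_ansi_demo(styled))  # visible characters only
strip_ansi_demo(styled)
```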


## Fixing list columns

To tweak the output in the `paths` column, we simply need to indicate that our class is an S3 vector:

```{r include=FALSE}
import::from(pillar, is_vector_s3)
```

```{r}
#' @importFrom pillar is_vector_s3
#' @export
is_vector_s3.latlon <- function(x) TRUE

data
```

This is picked up by the default implementation of `obj_sum()`, which then shows the type and the length in brackets. If your object is built on top of an atomic vector, the default will be adequate. You will, however, need to provide an `obj_sum()` method for your class if your object is vectorised and built on top of a list.

An example of an object of this type in base R is `POSIXlt`: it is a list with 9 components.

```{r}
x <- as.POSIXlt(Sys.time() + c(0, 60, 3600)) 
str(unclass(x))
```

But it pretends to be a vector with 3 elements:

```{r}
x
length(x)
str(x)
```

So we need to define a method that returns a character vector the same length as `x`:

```{r include=FALSE}
import::from(pillar, obj_sum)
```

```{r}
#' @importFrom pillar obj_sum
#' @export
obj_sum.POSIXlt <- function(x) {
  rep("POSIXlt", length(x))
}
```

## Testing

If you want to test the output of your code, you can compare it with a known state recorded in a text file. For this, pillar offers the `expect_known_display()` expectation, which requires the testthat package. Make sure that the output is generated only by your package to avoid inconsistencies when external code is updated. Here, this means that you test only the shaft portion of the pillar, and not the entire pillar or even a tibble that contains a column with your data type!

Attach testthat before running the tests:

```{r}
library(testthat)
```

```{r include = FALSE}
unlink("latlon.txt")
unlink("latlon-bw.txt")
```

The code below will compare the output of `pillar_shaft(data$loc)` with known output stored in the `latlon.txt` file. The first run warns because the file doesn't exist yet. 

```{r error = TRUE, warning = TRUE}
test_that("latlon pillar matches known output", {
  pillar::expect_known_display(
    pillar_shaft(data$loc),
    file = "latlon.txt"
  )
})
```

From the second run on, the printing will be compared with the file:

```{r}
test_that("latlon pillar matches known output", {
  pillar::expect_known_display(
    pillar_shaft(data$loc),
    file = "latlon.txt"
  )
})
```
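
The record-then-compare cycle can be sketched in a few lines of base R. The helper `check_known_output()` below is hypothetical and much simpler than `expect_known_display()`; it only illustrates why the first run has nothing to compare against while later runs do.

```r
# Hypothetical helper: returns "created", "match", or "mismatch".
check_known_output <- function(lines, file) {
  if (!file.exists(file)) {
    # First run: nothing to compare against, so record the output
    writeLines(lines, file)
    return("created")
  }
  if (identical(readLines(file), lines)) "match" else "mismatch"
}

path <- tempfile(fileext = ".txt")
out <- c("32d43'N 117d10'W", "NA")

check_known_output(out, path)            # first run records the file
check_known_output(out, path)            # second run compares and matches
check_known_output(c(out, "new"), path)  # changed output is detected

unlink(path)
```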

However, if we look at the file we'll notice strange things: The output contains ANSI escapes!

```{r}
readLines("latlon.txt")
```

We can turn them off by passing `crayon = FALSE` to the expectation, but the test then needs to be run twice again, because the recorded file no longer matches the new output:

```{r error = TRUE, warning = TRUE}
library(testthat)
test_that("latlon pillar matches known output", {
  pillar::expect_known_display(
    pillar_shaft(data$loc),
    file = "latlon.txt",
    crayon = FALSE
  )
})
```

```{r}
test_that("latlon pillar matches known output", {
  pillar::expect_known_display(
    pillar_shaft(data$loc),
    file = "latlon.txt",
    crayon = FALSE
  )
})

readLines("latlon.txt")
```

You may want to create a series of output files for different scenarios:

- Colored vs. plain (to simplify viewing differences)
- With or without special Unicode characters (if your output uses them)
- Different widths

For this it is helpful to create your own expectation function. Use the tidy evaluation framework to make sure that construction and printing happen at the right time:

```{r}
expect_known_latlon_display <- function(x, file_base) {
  quo <- rlang::quo(pillar::pillar_shaft(x))
  pillar::expect_known_display(
    !! quo,
    file = paste0(file_base, ".txt")
  )
  pillar::expect_known_display(
    !! quo,
    file = paste0(file_base, "-bw.txt"),
    crayon = FALSE
  )
}
```

```{r error = TRUE, warning = TRUE}
test_that("latlon pillar matches known output", {
  expect_known_latlon_display(data$loc, file_base = "latlon")
})
```

```{r}
readLines("latlon.txt")
readLines("latlon-bw.txt")
```

Learn more about the tidyeval framework in the [dplyr vignette](http://dplyr.tidyverse.org/articles/programming.html).

```{r include = FALSE}
unlink("latlon.txt")
unlink("latlon-bw.txt")
```