# `psql`: **PostgreSQL**'s shell client
### **§** General Notes

* We will most often abbreviate “PostgreSQL” as “Postgres,”
  simply because it is easier to say and read.  
  ([**Postgres** _vs_ **PostgreSQL** — _What is the difference?_ | **TablePlus**][PGres])
* “PRISM” is the nickname of EECS's network and collection of computers.
* The class's **Postgres(ql)** server
	* runs on _host_ machine `db.eecs.yorku.ca`
	* is version `11.1`.
* All machines on the PRISM LAN (_local area network_) are hidden
  behind a firewall.
  You must go through the _gateway_ machine
  to access any of them.
* **\*nix** is the general term for any _operating system_
  based in principle on **unix**.
  This covers, of course,
  any of the variants of **unix**,
  and **macOS**
  (for which the core is a **unix** _kernel_ called **darwin**).

[PGres]: https://tableplus.com/blog/2018/10/postgres-vs-postgresql.html (Postgres vs PostgreSQL - What is the difference? | TablePlus)

### **§** How to “talk” to a database

You need a _program_ to set up a connection to the _database server_
and specify the _database_
that the server manages
to which you want to talk
to be able to “interact” with the data
(in that database that that database server is managing).
As we shall discover in the course,
the _only_ way to interact with a _relational_ database
is via the “query language”.
(This is by design!
And we shall be learning why.)
for relational databases,
the standard _query language_ is **SQL**.

Later in the course,
we learn how to write programs
(e.g., in **Java** using **JDBC**)
that can communicate (via **SQL**!)
with a database managed by a database server.
In the case of **Java**,
the **JDBC** library lays out how we do this.

But it is useful to be able to connect _directly_
to a database as a user
to issue (**SQL**) commands and queries “by hand.”
In _operating systems_,
a general _command-line_ program
where the user can issue commands directly
is called a _shell_.
The term _shell_ is used more generally
for any interactive, command-line program
interface to a system.
The `psql` shell is the main tool
for interacting with a database
managed by a **PostgreSQL** server.

Most database systems are a _client-server_ configuration:
the **DBMS** (_data-base management system_) runs as a _server_;
and applications that use and manipulate the data
managed by the **DBMS**
are _clients_.
Most often,
a database client does not run on the same machine as the server.
(A **DBMS** server may have many, many active clients at any given time.)
This will commonly be the case
for `psql` clients.

Are there non-shell clients for **Postgres**,
like pretty GUI (_graphical user interface_) ones?
Yes, of course.
But we do not really need them here.

### **§** The **EECS-3421** **Postgres** set up

For the class,
we use a **Postgres** server running on `db.eecs.yorku.ca`.
Let us nickname this database server **DB**.
An **RDBMS** can manage many different _databases_.
In the class,
we talk about a database as a logically organized, coherent
_schema_ that models a _domain_
and the data that populates the schema.
On the system side of the fence,
a database for the **RDBMS** 
is a data (and schema) _container_
with its own authorizations and permissions
(and resource allocations).

Each student is assigned their own database on **DB**
for use in the course.
The database is named the same as their EECS account name;
EECS account “godfrey” has
a database on **DB** named “godfrey”.

### **§** The **PSQL** shell client

All the machines in PRISM have the `psql` client installed.
To connect to your (or some) _database_ hosted by **DB**,
open a shell on your favourite PRISM machine
and type, say,

% psql -h db -U godfrey -d godfrey

Well, replace “godfrey” with your account (and database) name.
The `-h` argument (“host”) specifies the _host_ machine
where the database server runs.
A `-p` argument (“port”) would specify the _port_ (e.g., `5432`);
more than one database system _could_ be running on the _same_ host.
This value has a default, which is correct here.
The `-U` argument (“user”) specifies the _database user account_.
**DB** has been set up so this mirrors your EECS account.
We can leave out the `-U`;
it will default to be your “EECS account”.
And finally,
the `-d` argument (“database”) says which _database_
you are connecting to.
this will be _your_ database,
which has the same name as your EECS account,
as talked about above.

% psql -h db -d godfrey

replace “godfrey” with _your_ account name.
You will be prompted for a _password_.
We have initialized this to be your York _student number_.
Once you have correctly provided your password,
you will be in the `psql` shell.

What are the commands?!
How do you navigate?
We leave that for you to learn!
[**PostgreSQL**: _Documentation_: _11_: _psql_][psql-11]
to get started.
Note that the shell has a really good built-in _help_ facility.
(To quit,
type “quit” or “^D”.)
And, of course,
we will have to learn **SQL** to make really full use of `psql`!

[psql-11]: https://www.postgresql.org/docs/11/app-psql.html (PostgreSQL: Documentation: 11: psql)

### **§** Setting up drop-through authentication

You can live with having to type in your password
each time you start a `psql`,
of course.
And most people will just deal with it.

But do you _really_ not like having to type in your **Postgres** password
every time you invoke the `psql` shell?
If so,
and if you are so inclined,
here is a way to set up _drop-through authentication_.

#### **¶** the `.pgpass` file

By putting a file named “.pgpass” in your home directory
with perms `600`
(e.g., `chmod u=+rw-x,g=-rwx,o=-rwx .pgpass`)
“on the machine” where your client will run
will give you drop-through authentication
to the `psql` shell
without having to provide your passwd.

The format of `.pgpass` is as follows:




In that line,
“db” is the host machine,
where the **Postgres** server is,
“5432” is the connection port
(on which clients call the server),
the first “godfrey”
is the _database_ hosted by the **Postgres** server
to which we are connecting,
the second “godfrey”
is the _database account_
that we are using,
and “LIES”
is the account's _password_.

In the standard set up for **Postgres**,
the server listens on port `5432` for client connections.
Of course,
in the set up,
one could use a different port for this.
(The server on `db` is using port `5432`.)

You can put in wildcards with “*”.
**PSQL** is matching into the file.


The command-line call still needs you to provide the HOST,
it seems.

% psql -h db

#### **¶** sources  

* [**PostgreSQL**: Use of psql, .pgpass file and PGPASSFILE to store passwords][PGPASS]
* Can be found in the Postgresql documentation too, of course.

[PGPASS]: https://newfivefour.com/postgresql-pgpass-password-file.html (PostgreSQL: Use of psql, .pgpass file and PGPASSFILE to store passwords)

### Using your _own_ **PSQL** client

Want to run your _own_ `psql` client
on your _own_ computer and
connect to the class's **PostgreSQL** server?
that would certainly be overkill
for what you need
for your classwork
in this course.
But we _are_ training data scientists here,
and such skills are more than useful.

#### **¶** install the **PostgreSQL** client

So, if you are so inclined and want to do this,
follow along.
install the **PostgreSQL** _client_ on your computer.
The class's server running at `db.eecs.yorku.ca` is 
version `11.1`,
so install the latest **PostgreSQL 11** client.
(This was `11.8` when I did.)

#### **¶** set up an **SSH** tunnel

% ssh -L 63333:db.eecs.yorku.ca:5432 godfrey@red.eecs.yorku.ca

This opens a shell on `red`.
And this has set port forwarding from `localhost:63333` to `db:5432`
(the connection port for clients for **Postgres**)
from your machine through `red`.

Replace `godfrey` above with _your_ EECS account name,
of course.
Port `63333` is the port on your computer
to be used
by your local `psql` via which to connect.
You can use whatever port number you want
(which is not already in use by something else);
IANA reserves ports `49152` through `65535`
on **\*nix** machines
for “private” use.
And port `5432` _on the other side of the tunnel_
is the standard port that a **PostgreSQL** server
listens on for clients.
(**SSH** is binding your local port `63333` to port `5432` at `db`.)

Why are we wanting to tunnel _through_ `red`?
We cannot connect to `db` _directly_ from outside
of the PRISM local area network (LAN).
The machine `red.eecs.yorku.ca`
is the _gateway_ computer through
which machines outside of PRISM can connect.

#### **¶** and connect your **PSQL** client to EECS's Postgres server

Then, in a second local shell,
we can say

% psql -h localhost -p 63333 -d godfrey

Replace `godfrey` with _your_ database name.
(By convention,
for **EECS-3421**,
we make each person a _database_
of the _same_ name as their EECS account.)

And this has connected your _local client_ `psql`
to `db`'s **Postgres** server!
Your connection from your computer (`localhost`) to `red`
is on an encrypted channel,
via the ssh tunnel.
The connection from `red` to `db` is not in this ssh tunnel.
**Postgres** on `db` sees your client
as if it were on `red`.

Note you are asked for your _passwd_ for `db`'s **Postgres**,
even if you set up `.pgpass` in your home directory on PRISM
Your `psql` client is running locally,
on your machine!

If you are on a __\*nix__ machine (_linux_, _unix_, _macOS_),
you could set up a `.pgpass` in your home directory
on your local computer to allow for pass-through authentication.
Your local `psql` client sees this connection
as if it were to a server running locally (`localhost`).
It is the magic of the ssh tunnel we have running that
tunnels this to the **Postgres** server on `db.eecs.yorku.ca`!
our local `.pgpass` would need to look, for example, like


Note that,
if you are running a local **Postgres** server of your own,
this could get in the way of your connecting to your local server!
So configure things sensibly.

#### **¶** sources

For this,
I consulted and adapted
[PostgreSQL: Documentation: 9.6: Secure TCP/IP Connections with SSH Tunnels][Tunnel].

#### **¶** Windows OS

[Tunnel]: https://www.postgresql.org/docs/9.6/ssh-tunnels.html (PostgreSQL: Documentation: 9.6: Secure TCP/IP Connections with SSH Tunnels)

If you are on a _windows_ machine,
then … good luck!
There is no doubt a way to set this up,
but I have not investigated how yet.

### **§** Using your _own_ **PostgreSQL** server (and **PSQL** client)

**Postgresql** is not too difficult an install
(compared to, say, **IBM DB2** or **Oracle**).
if you are really interested,
have a decent machine,
and are so inclined,
go for it!

### **§** Using a different RDBMS

**Postgres** is not the only _relational database management system_.
Everyone has likely heard of **Oracle**, **IBM DB2**,
and Microsoft's **SQL Server**.
And there are many others.

Projects in the class are on a **PostgreSQL** database system,
And while **SQL** is **SQL** is **SQL**,
there are minor differences from system to system,
enough so to be annoying.
So we do insist that you do your work for this class in **Postgres**.
It is your responsibility to ensure
that what you turn in
works properly on PRISM's **Postgres** (**DB**).

That is not to say you shouldn't practise **SQL**
on another database system if you have one readily available.
It's all good.


|                     |                       |
| **initial author**: | _parke godfrey_       |
|        **created**: | _2020 August 7_       |
|   **last revised**: | _2020 August 13_      |

[PG]: mailto:godfrey@yorku.ca (Mail to "Parke Godfrey" )