Nash
November 29, 2022, 1:18pm
1
I have two data frames. The first, A, contains the headings “Timestamp”, “Ticker”, “Price”. The second, B, contains “Timestamp”, “Ticker”, “ProfitMargin”.
How do I join A and B, such that I get:
C = [Timestamp, Ticker, Price, ProfitMargin] for only those Timestamp values that are shared between A and B, and without having duplicate headers?
p-gw
November 29, 2022, 2:21pm
2
The DataFrames.jl documentation has quite a few examples of how to join two data frames:
https://dataframes.juliadata.org/stable/man/joins/
If I read your example right, you want to innerjoin the two data frames A and B:
julia> A = DataFrame(Timestamp = [1, 2, 3], Ticker = ["a", "b", "c"], Price = [1.0, 2.0, 3.0])
3×3 DataFrame
 Row │ Timestamp  Ticker  Price
     │ Int64      String  Float64
─────┼────────────────────────────
   1 │         1  a           1.0
   2 │         2  b           2.0
   3 │         3  c           3.0

julia> B = DataFrame(Timestamp = [2, 3, 4], Ticker = ["b", "c", "a"], ProfitMargin = [10.0, 20.0, 30.0])
3×3 DataFrame
 Row │ Timestamp  Ticker  ProfitMargin
     │ Int64      String  Float64
─────┼─────────────────────────────────
   1 │         2  b               10.0
   2 │         3  c               20.0
   3 │         4  a               30.0

julia> C = innerjoin(A, B, on = [:Timestamp, :Ticker])
2×4 DataFrame
 Row │ Timestamp  Ticker  Price    ProfitMargin
     │ Int64      String  Float64  Float64
─────┼──────────────────────────────────────────
   1 │         2  b           2.0          10.0
   2 │         3  c           3.0          20.0
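
If you would rather match rows on Timestamp alone, Ticker becomes an ordinary column that exists in both tables, and that is where duplicate headers come from. Here is a rough sketch of two ways to handle it, reusing the toy A and B from above (the names C1 and C2 are just for illustration, and whether matching on Timestamp alone is right depends on your real data):

```julia
using DataFrames

A = DataFrame(Timestamp = [1, 2, 3], Ticker = ["a", "b", "c"], Price = [1.0, 2.0, 3.0])
B = DataFrame(Timestamp = [2, 3, 4], Ticker = ["b", "c", "a"], ProfitMargin = [10.0, 20.0, 30.0])

# Joining on :Timestamp alone leaves Ticker as a non-key column in both tables.
# Without makeunique = true this call throws an ArgumentError about duplicate names;
# with it, the copy coming from B is renamed to Ticker_1.
C1 = innerjoin(A, B, on = :Timestamp, makeunique = true)

# Alternative: drop B's Ticker before joining so only one Ticker column survives.
C2 = innerjoin(A, select(B, Not(:Ticker)), on = :Timestamp)

println(names(C1))
println(names(C2))
```

Putting both columns in on, as in the innerjoin call above, is usually the simpler choice when Ticker is really part of the key, since key columns appear only once in the result.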